A computer components & hardware forum. HardwareBanter


Roadrunner Supercomputer using 12,960 CELL Processors Hits 1 PetaFlop(1000 TeraFlops) of double-precision FP Performance



 
 
  #1  
Old June 9th 08, 01:07 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
AirRaid

Yes, this is cross-posted, because IMO it deserves to be. This is a
milestone in computing performance, thus, IMO, it should be as widely
posted as possible.

This isn't spam whatsoever, so if you want to get ****ed that it's a
cross post, okay,
just don't think it's spam.

I posted about the Roadrunner supercomputer back in September 2006
when it was announced. Now it's up and running.




http://graphics8.nytimes.com/images/...op.enlarge.jpg

Military Supercomputer Sets Record

By JOHN MARKOFF
Published: June 9, 2008

SAN FRANCISCO — An American military supercomputer, assembled from
components originally designed for video game machines, has reached a
long-sought-after computing milestone by processing more than 1.026
quadrillion calculations per second.


The Roadrunner supercomputer costs $133 million and will be used to
study nuclear weapons.

The new machine is more than twice as fast as the previous fastest
supercomputer, the I.B.M. BlueGene/L, which is based at Lawrence
Livermore National Laboratory in California.

The new $133 million supercomputer, called Roadrunner in a reference
to the state bird of New Mexico, was devised and built by engineers
and scientists at I.B.M. and Los Alamos National Laboratory, based in
Los Alamos, N.M. It will be used principally to solve classified
military problems to ensure that the nation’s stockpile of nuclear
weapons will continue to work correctly as they age. The Roadrunner
will simulate the behavior of the weapons in the first fraction of a
second during an explosion.

Before it is placed in a classified environment, it will also be used
to explore scientific problems like climate change. The greater speed
of the Roadrunner will make it possible for scientists to test global
climate models with higher accuracy.

To put the performance of the machine in perspective, Thomas P.
D’Agostino, the administrator of the National Nuclear Security
Administration, said that if all six billion people on earth used hand
calculators and performed calculations 24 hours a day and seven days a
week, it would take them 46 years to do what the Roadrunner can in one
day.
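
As a rough sanity check on the hand-calculator comparison, the sketch below works out how fast each person would have to calculate for the quoted figures to agree. It uses only the numbers in the article (1.026 quadrillion calculations per second, six billion people, 46 years); the 365.25-day year and the function name are assumptions made for illustration.

```python
# Back-of-the-envelope check of the hand-calculator comparison.
# Every input except the year length comes from the article text.

ROADRUNNER_OPS_PER_SEC = 1.026e15   # "1.026 quadrillion calculations per second"
PEOPLE = 6e9                        # "six billion people"
YEARS = 46                          # "46 years"
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25              # assumption: average year length


def implied_rate_per_person():
    """Calculations per second each person must sustain to match one Roadrunner day."""
    roadrunner_ops_in_one_day = ROADRUNNER_OPS_PER_SEC * SECONDS_PER_DAY
    person_seconds = PEOPLE * YEARS * DAYS_PER_YEAR * SECONDS_PER_DAY
    return roadrunner_ops_in_one_day / person_seconds


if __name__ == "__main__":
    print(f"{implied_rate_per_person():.1f} calculations per second per person")
```

Under these assumptions the comparison implies roughly ten calculations per second per person, so it is best read as an order-of-magnitude illustration rather than a precise equivalence.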

The machine is an unusual blend of chips used in consumer products and
advanced parallel computing technologies. The lessons that computer
scientists learn by making it calculate even faster are seen as
essential to the future of both personal and mobile consumer
computing.

The high-performance computing goal, known as a petaflop — one
thousand trillion calculations per second — has long been viewed as a
crucial milestone by military, technical and scientific organizations
in the United States, as well as a growing group including Japan,
China and the European Union. All view supercomputing technology as a
symbol of national economic competitiveness.

By running programs that find a solution in hours or even less time —
compared with as long as three months on older generations of
computers — petaflop machines like Roadrunner have the potential to
fundamentally alter science and engineering, supercomputer experts
say. Researchers can ask questions and receive answers virtually
interactively and can perform experiments that would previously have
been impractical.

“This is equivalent to the four-minute mile of supercomputing,” said
Jack Dongarra, a computer scientist at the University of Tennessee who
for several decades has tracked the performance of the fastest
computers.

Each new supercomputing generation has brought scientists a step
closer to faithfully simulating physical reality. It has also produced
software and hardware technologies that have rapidly spilled out into
the rest of the computer industry for consumer and business products.

Technology is flowing in the opposite direction as well. Consumer-
oriented computing began dominating research and development spending
on technology shortly after the cold war ended in the late 1980s, and
that trend is evident in the design of the world’s fastest computers.

The Roadrunner is based on a radical design that includes 12,960 chips
that are an improved version of an I.B.M. Cell microprocessor, a
parallel processing chip originally created for Sony’s PlayStation 3
video-game machine. The Sony chips are used as accelerators, or
turbochargers, for portions of calculations.

The Roadrunner also includes a smaller number of more conventional
Opteron processors, made by Advanced Micro Devices, which are already
widely used in corporate servers.

“Roadrunner tells us about what will happen in the next decade,” said
Horst Simon, associate laboratory director for computer science at the
Lawrence Berkeley National Laboratory. “Technology is coming from the
consumer electronics market and the innovation is happening first in
terms of cellphones and embedded electronics.”

The innovations flowing from this generation of high-speed computers
will most likely result from the way computer scientists manage the
complexity of the system’s hardware.

Roadrunner, which consumes roughly three megawatts of power, or about
the power required by a large suburban shopping center, requires three
separate programming tools because it has three types of processors.
Programmers have to figure out how to keep all of the 116,640
processor cores in the machine occupied simultaneously in order for it
to run effectively.
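
The core count and the power figure can be cross-checked with a little arithmetic. The sketch below assumes nine cores per Cell chip (one PPE plus eight SPEs, a breakdown the article does not state) and takes "roughly three megawatts" at face value; the constant and function names are illustrative only.

```python
CELL_CHIPS = 12_960            # from the article
CORES_PER_CELL = 9             # assumption: 1 PPE + 8 SPEs per chip
QUOTED_CORES = 116_640         # from the article
SUSTAINED_FLOPS = 1.026e15     # from the article
POWER_WATTS = 3e6              # "roughly three megawatts"


def cell_cores():
    """Total cores contributed by the Cell chips alone."""
    return CELL_CHIPS * CORES_PER_CELL


def mflops_per_watt():
    """Sustained performance per watt, in MFLOPS/W."""
    return SUSTAINED_FLOPS / POWER_WATTS / 1e6


if __name__ == "__main__":
    print(cell_cores(), cell_cores() == QUOTED_CORES)
    print(f"{mflops_per_watt():.0f} MFLOPS/W")
```

With these inputs the Cell chips alone account for all 116,640 quoted cores, which suggests the figure may leave out the Opteron cores, and the sustained rate works out to about 342 MFLOPS per watt.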

“We’ve proved some skeptics wrong,” said Michael R. Anastasio, a
physicist who is director of the Los Alamos National Laboratory. “This
gives us a window into a whole new way of computing. We can look at
phenomena we have never seen before.”

Solving that programming problem is important because in just a few
years personal computers will have microprocessor chips with dozens or
even hundreds of processor cores. The industry is now hunting for new
techniques for making use of the new computing power. Some experts,
however, are skeptical that the most powerful supercomputers will
provide useful examples.

“If Chevy wins the Daytona 500, they try to convince you the Chevy
Malibu you’re driving will benefit from this,” said Steve Wallach, a
supercomputer designer who is chief scientist of Convey Computer, a
start-up firm based in Richardson, Tex.

Those who work with weapons might not have much to offer the video
gamers of the world, he suggested.

Many executives and scientists see Roadrunner as an example of the
resurgence of the United States in supercomputing.

Although American companies had dominated the field since its
inception in the 1960s, in 2002 the Japanese Earth Simulator briefly
claimed the title of the world’s fastest by executing more than 35
trillion mathematical calculations per second. Two years later, a
supercomputer created by I.B.M. reclaimed the speed record for the
United States. The Japanese challenge, however, led Congress and the
Bush administration to reinvest in high-performance computing.

“It’s a sign that we are maintaining our position,” said Peter J.
Ungaro, chief executive of Cray, a maker of supercomputers. He noted,
however, that “the real competitiveness is based on the discoveries
that are based on the machines.”

Having surpassed the petaflop barrier, I.B.M. is already looking
toward the next generation of supercomputing. “You do these record-
setting things because you know that in the end we will push on to the
next generation and the one who is there first will be the leader,”
said Nicholas M. Donofrio, an I.B.M. executive vice president.

By breaking the petaflop barrier sooner than had been generally
expected, the United States’ supercomputer industry has been able to
sustain a pace of continuous performance increases, improving a
thousandfold in processing power in 11 years. The next thousandfold
goal is the exaflop, which is a quintillion calculations per second,
followed by the zettaflop, the yottaflop and the xeraflop.
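
To put "a thousandfold in 11 years" in more familiar terms, the sketch below converts it into an annual growth factor and a doubling time. It assumes smooth exponential growth over the whole period, which is a simplification; the function names are hypothetical.

```python
import math


def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over the period."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor, years):
    """Time for performance to double, given the same steady exponential growth."""
    return years * math.log(2) / math.log(total_factor)


if __name__ == "__main__":
    print(f"{annual_growth_factor(1000, 11):.2f}x per year")
    print(f"doubling every {doubling_time_years(1000, 11):.2f} years")
```

Under that assumption, performance grows by a factor of about 1.87 each year, doubling roughly every 1.1 years.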

http://www.nytimes.com/2008/06/09/te...aflops.html?hp


other articles:

http://www.engadget.com/2008/06/09/w...eaks-petaflop/
http://www.theregister.co.uk/2008/06...omputer_debut/
http://www.itproportal.com/articles/...aflop-barrier/
http://www.theinquirer.net/gb/inquir...litary-achieve
  #2  
Old June 10th 08, 10:32 AM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
mr deo


"AirRaid" wrote in message
...
Yes, this is cross-posted, because IMO it deserves to be. This is a
milestone in computing performance, thus, IMO, it should be as widely
posted as possible.

This isn't spam whatsoever, so if you want to get ****ed that it's a
cross post, okay,
just don't think it's spam.

I posted about the Roadrunner supercomputer back in September 2006
when it was announced. Now it's up and running.


Nothing special about it tbh.
If we had moved into DNA, chemical, or optical processors to manage such a
task then I would say it is special.
The fact that they have managed to bolt different types of processors
onto a board (think GPU, CPU, and MMPU) and then link the boards together
(think NIC) really doesn't show any innovation.

I know my example isn't the same as what we have going on in these servers,
but in essence, anyone who wants to throw 400m into a project can see them
double what this kit is doing, so it's nothing but a monetary step.


  #3  
Old June 10th 08, 10:34 AM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
mr deo

"AirRaid" wrote in message
...
Yes, this is cross-posted, because IMO it deserves to be. This is a
milestone in computing performance, thus, IMO, it should be as widely
posted as possible.

This isn't spam whatsoever, so if you want to get ****ed that it's a
cross post, okay,
just don't think it's spam.

I posted about the Roadrunner supercomputer back in September 2006
when it was announced. Now it's up and running.


Nothing special about it tbh.
If we had moved into DNA, chemical, or optical processors to manage such a
task then I would say it is special.
The fact that they have managed to bolt different types of processors
onto a board (think GPU, CPU, and MMPU) and then link the boards together
(think NIC) really doesn't show any innovation.

I know my example isn't the same as what we have going on in these servers,
but in essence, anyone who wants to throw 400m into a project can see them
double what this kit is doing, so it's nothing but a monetary step.

(damn your HTML original post)


  #4  
Old June 10th 08, 05:19 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
Robert Myers

On Jun 10, 5:34 am, "mr deo"
wrote:


Nothing special about it tbh.
If we had moved into DNA, chemical, or optical processors to manage such a
task then I would say it is special.
The fact that they have managed to bolt different types of processors
onto a board (think GPU, CPU, and MMPU) and then link the boards together
(think NIC) really doesn't show any innovation.

I know my example isn't the same as what we have going on in these servers,
but in essence, anyone who wants to throw 400m into a project can see them
double what this kit is doing, so it's nothing but a monetary step.


It's hard to know what your perspective could be. The general
question is: what could you do if you coupled a high-end out-of-order
general-purpose CPU with a beefy compute engine and serious
interconnect? Cell (or similar compute-intensive hardware) has
important implications for power consumption. Balancing the memory,
the bandwidth of all the interconnects, the switches, and the compute
density of each node (how many processors? connected how?) is more
than just buying a bunch of Macs and hooking them together with
gigabit Ethernet.

I'd like to see a serious discussion of this machine, but this thread
with all these cross-posts isn't the right place to be doing it. The
one-petaflop crap is pure national labs PR. The interesting stuff is
elsewhere. I'm sorry if you can't see that.

Robert.

  #5  
Old June 11th 08, 10:22 AM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
[email protected]

On Jun 10, 7:19 pm, Robert Myers wrote:
On Jun 10, 5:34 am, "mr deo"
wrote:



Nothing special about it tbh.
If we had moved into DNA, chemical, or optical processors to manage such a
task then I would say it is special.
The fact that they have managed to bolt different types of processors
onto a board (think GPU, CPU, and MMPU) and then link the boards together
(think NIC) really doesn't show any innovation.

I know my example isn't the same as what we have going on in these servers,
but in essence, anyone who wants to throw 400m into a project can see them
double what this kit is doing, so it's nothing but a monetary step.


It's hard to know what your perspective could be. The general
question is: what could you do if you coupled a high-end out-of-order
general-purpose CPU with a beefy compute engine and serious
interconnect? Cell (or similar compute-intensive hardware) has
important implications for power consumption. Balancing the memory,
the bandwidth of all the interconnects, the switches, and the compute
density of each node (how many processors? connected how?) is more
than just buying a bunch of Macs and hooking them together with
gigabit Ethernet.

I'd like to see a serious discussion of this machine, but this thread
with all these cross-posts isn't the right place to be doing it. The
one-petaflop crap is pure national labs PR. The interesting stuff is
elsewhere. I'm sorry if you can't see that.

Robert.


I'd agree with mr deo.
Roadrunner's interconnects look like a big step backward from those of
other PR-heavy American supercomputers. Hopefully they are not as boring
as those in the Virginia Tech cluster that you mentioned, but they are not
in the same class as Cray/Sandia Red Storm, SGI/NASA Columbia or
IBM/LLNL BlueGene/L.

Also, it looks like the programming model for Roadrunner (3 separate
architectures and many programmer-visible levels of memory hierarchy)
is much harder to use effectively than that of just about any big
supercomputer built to date.

Last, but not least, performance per watt and per cubic meter seem
seriously worse than BlueGene's.
  #6  
Old June 11th 08, 10:32 AM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
[email protected]

On a positive note, successful testing of the Roadrunner means that
IBM has the ability to manufacture a new variant of Cell with a fully
pipelined double-precision FPU in production quantity.
The IBM web site indicates that the new engine is available to mere mortals:
http://www-03.ibm.com/systems/bladec...s22/index.html
  #7  
Old June 11th 08, 03:32 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
Air Raid

On Jun 11, 4:32 am, wrote:
On the positive note, successful testing of the Roadrunner means that
IBM has the ability to manufacture a new variant of Cell with fully-
pipelined double-precision FPU in production quantity.
IBM web site indicates that a new engine is available to mere mortals: http://www-03.ibm.com/systems/bladec...ers/qs22/index...




Interesting thing about the IBM PowerXCell 8i Processor is that it
offers 4 to 5 (IBM says 5) times the double precision FP performance
of the original Cell Processor.

Depending on various factors such as having 7 or 8 SPEs active,
counting the PPE or not counting it, and clockspeed, the original CELL
could manage 218 to 256 to just under 300 GFLOPs of single precision
FP. When double precision is needed performance drops massively, down
to around 25 GFLOPs.

The IBM PowerXCell 8i is said to be capable of over 100 GFLOPs
double precision. That's a huge increase without adding more SPEs or
upping clockspeed.
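
The figures above can be turned into a quick estimate of what the Cell chips contribute to Roadrunner. This is only a sketch built from numbers quoted in this thread: 100 GFLOPs is treated as a lower bound ("over 100"), the Opterons are ignored, and the names are made up for illustration.

```python
ORIGINAL_CELL_DP_GFLOPS = 25.0   # "around 25 GFLOPs" for the original Cell
POWERXCELL_DP_GFLOPS = 100.0     # "over 100 GFLOPs", used here as a lower bound
CELL_CHIPS = 12_960              # from the thread title
LINPACK_PFLOPS = 1.026           # sustained figure from the quoted article


def dp_speedup():
    """Double-precision gain of PowerXCell 8i over the original Cell."""
    return POWERXCELL_DP_GFLOPS / ORIGINAL_CELL_DP_GFLOPS


def aggregate_cell_pflops():
    """Lower bound on the combined double-precision peak of all Cell chips."""
    return CELL_CHIPS * POWERXCELL_DP_GFLOPS / 1e6   # GFLOPS -> PFLOPS


def sustained_fraction():
    """Upper bound on the sustained-to-peak ratio, since the peak is a lower bound."""
    return LINPACK_PFLOPS / aggregate_cell_pflops()


if __name__ == "__main__":
    print(dp_speedup(), aggregate_cell_pflops(), round(sustained_fraction(), 2))
```

With these inputs the speedup is 4x, the Cell chips supply at least about 1.3 PFLOPS of double-precision peak, and the 1.026 PFLOPS result would be at most about 79% of that.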

PowerXCell 8i cannot be considered a next-generation CELL, only an
enhanced first-gen CELL.

IBM plans to put 32 SPEs on the next-gen CELL to hit 1 TFLOP (single
precision I would imagine) in a single chip by 2010. There was also
an official roadmap that showed a CELL with 64 SPEs on a process
smaller than 45nm (be it 32nm, 22nm, I don't know). I posted about
both in the past.

It's clear that the IBM-Toshiba-Sony CELL is proving to be much more
useful beyond PS3 than the Sony-Toshiba 'Emotion Engine' ever was,
which really had no use outside of PS2 and cheap, home-made university
"supercomputers" such as the one using 60 or 70 PS2s at UIC in IL.

Roadrunner is serious stuff, and it's only the beginning. In the next
decade we'll see more powerful supercomputers using next-gen CELLs.
  #8  
Old June 11th 08, 03:53 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
Neal

On Jun 11, 7:32 am, Air Raid wrote:
On Jun 11, 4:32 am, wrote:

On the positive note, successful testing of the Roadrunner means that
IBM has the ability to manufacture a new variant of Cell with fully-
pipelined double-precision FPU in production quantity.
IBM web site indicates that a new engine is available to mere mortals: http://www-03.ibm.com/systems/bladec...ers/qs22/index...


Interesting thing about the IBM PowerXCell 8i Processor is that it
offers 4 to 5 (IBM says 5) times the double precision FP performance
of the original Cell Processor.

Depending on various factors such as having 7 or 8 SPEs active,
counting the PPE or not counting it, and clockspeed, the original CELL
could manage 218 to 256 to just under 300 GFLOPs of single precision
FP. When double precision is needed performance drops massively, down
to around 25 GFLOPs.

The IBM PowerXCell 8i is said to be capable of over 100 GFLOPs
double precision. That's a huge increase without adding more SPEs or
upping clockspeed.

PowerXCell 8i cannot be considered a next-generation CELL, only an
enhanced first-gen CELL.

IBM plans to put 32 SPEs on the next-gen CELL to hit 1 TFLOP (single
precision I would imagine) in a single chip by 2010. There was also
an official roadmap that showed a CELL with 64 SPEs on a process
smaller than 45nm (be it 32nm, 22nm, I don't know). I posted about
both in the past.

It's clear that the IBM-Toshiba-Sony CELL is proving to be much more
useful beyond PS3 than the Sony-Toshiba 'Emotion Engine' ever was,
which really had no use outside of PS2 and cheap, home-made university
"supercomputers" such as the one using 60 or 70 PS2s at UIC in IL.

Roadrunner is serious stuff, and it's only the beginning. In the next
decade we'll see more powerful supercomputers using next-gen CELLs.


Though the CELL may prove more useful, it still has some serious arch
issues that a lot of people don't like. Programming the thing is not
the easiest thing in the world to do (though parallel programming
models for CMPs are somewhat of an open issue). It's nice that they
will continue to push performance, but many GPUs will be well above
1TFLOPS SPFP by 2010... which means that obviously Larrabee will be
out then. While I'm glad that CELL came out as it's enlightened the
world and solved several of the multicore integration problems, it
simply doesn't seem like the chip of the future right now. I simply
haven't heard a whole lot of interest from the HPC community on CELL,
but you never know... I could be wrong.
  #9  
Old June 11th 08, 05:10 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
Robert Myers

On Jun 11, 5:22 am, wrote:


I'd agree with mr deo.
Roadrunner interconnects look like a big step backward from other PR-
heavy American supercomputers.


http://www.lanl.gov/orgs/hpc/roadrun...8_RR_model.pdf

The predicted worst-case latency is about the same as Blue-Gene. Red
Storm routing/switching looks like Blue Gene. Columbia uses both
Infiniband and Numalink in a fat tree like Roadrunner.

Hopefully, they are not as boring as in
Virginia Tech cluster that you mentioned, but not in the same class as
Cray/Sandia Red Storm, SGI/NASA Columbia or IBM/LLNL BlueGene/L.

Could you be more specific than, "Hopefully they are not as boring?"
Blue Gene and Red Storm have an interconnect topology that's fine for
problems with good locality. Not so good for problems requiring
global communication. Columbia and Roadrunner use fat trees, much
better for the problems that interest me the most.
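
To illustrate the locality argument, the sketch below compares hop counts in a 3D torus with those in a tree-like network. The torus dimensions and tree depth are hypothetical values chosen for illustration, not the actual configurations of Blue Gene, Red Storm, Columbia, or Roadrunner, and hop count is only one factor (bisection bandwidth matters at least as much).

```python
def torus_diameter(dims):
    """Worst-case number of links between two nodes in a torus with wraparound."""
    return sum(k // 2 for k in dims)


def torus_mean_hops(dims):
    """Average hop count from one node to every node (itself included)."""
    def ring_mean(k):
        # Mean shortest distance around a ring of k nodes.
        return sum(min(d, k - d) for d in range(k)) / k
    # Distances along each dimension add up, so the per-ring means add as well.
    return sum(ring_mean(k) for k in dims)


def tree_diameter(levels):
    """Worst-case links between two hosts in a tree with `levels` switch levels:
    up to a top-level switch and back down again."""
    return 2 * levels


if __name__ == "__main__":
    dims = (32, 32, 32)   # hypothetical 32,768-node torus
    print(torus_diameter(dims), torus_mean_hops(dims), tree_diameter(3))
```

For this hypothetical 32x32x32 torus, nearest neighbours are one hop apart but a random pair averages 24 hops with a worst case of 48, while a three-level tree keeps every pair within 6 links. That is the sense in which a torus suits problems with good locality and a fat tree suits global communication.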

Also it looks like programming model for Roadrunner (3 separate
architecture and many programmer-visible levels of memory hierarchy)
is much harder to use effectively than just about any big
supercomputer built up to date.

Coprocessors are here to stay. At the moment, they will be painfully
visible to the programmer.

Last, but not least, performance per watt and per cubic meter seem
seriously worse than BlueGene.


Not important for the time being. FP-intensive coprocessors can be
very effective in flops/watt. Neither the OOO-processor nor the FP
coprocessor was optimized for power performance. The more important
point, for the moment, is to show that all those flops can actually be
put to use.

To get back to the previous poster's point, building a machine with
lots of flops requires only money. Actually using those flops for
something other than linpack is another matter.

Robert.

  #10  
Old June 11th 08, 08:36 PM posted to comp.sys.super,comp.arch,alt.comp.hardware.amd.x86-64,comp.sys.ibm.pc.hardware.chips,alt.games.video.sony-playstation3
Cydrome Leader

In comp.sys.super AirRaid wrote:
Yes, this is cross-posted, because IMO it deserves to be. This is a
milestone in computing performance, thus, IMO, it should be as widely
posted as possible.

This isn't spam whatsoever, so if you want to get ****ed that it's a
cross post, okay,
just don't think it's spam.

I posted about the Roadrunner supercomputer back in September 2006
when it was announced. Now it's up and running.




http://graphics8.nytimes.com/images/...op.enlarge.jpg

Military Supercomputer Sets Record

By JOHN MARKOFF
Published: June 9, 2008

SAN FRANCISCO — An American military supercomputer, assembled from
components originally designed for video game machines, has reached a
long-sought-after computing milestone by processing more than 1.026
quadrillion calculations per second.


The Roadrunner supercomputer costs $133 million and will be used to
study nuclear weapons.

The new machine is more than twice as fast as the previous fastest
supercomputer, the I.B.M. BlueGene/L, which is based at Lawrence
Livermore National Laboratory in California.

The new $133 million supercomputer, called Roadrunner in a reference
to the state bird of New Mexico, was devised and built by engineers
and scientists at I.B.M. and Los Alamos National Laboratory, based in
Los Alamos, N.M. It will be used principally to solve classified
military problems to ensure that the nation’s stockpile of nuclear
weapons will continue to work correctly as they age. The Roadrunner
will simulate the behavior of the weapons in the first fraction of a
second during an explosion.


I wonder what they really do with these computers.

I find it unlikely they still don't know how nuclear weapons work,
especially considering they're mature technology and they've been around
for decades.
 






All times are GMT +1.


Copyright ©2004-2024 HardwareBanter.
The comments are property of their posters.