#1
Roadrunner Supercomputer using 12,960 CELL Processors Hits 1 PetaFlop(1000 TeraFlops) of double-precision FP Performance
Yes, this is cross-posted, because IMO it deserves to be. This is a milestone in computing performance, thus, IMO, it should be as widely posted as possible. This isn't spam whatsoever, so if you want to get ****ed that it's a cross post, okay, just don't think it's spam.

I posted about the Roadrunner supercomputer back in September 2006 when it was announced. Now it's up and running.

http://graphics8.nytimes.com/images/...op.enlarge.jpg

Military Supercomputer Sets Record
By JOHN MARKOFF
Published: June 9, 2008

SAN FRANCISCO — An American military supercomputer, assembled from components originally designed for video game machines, has reached a long-sought-after computing milestone by processing more than 1.026 quadrillion calculations per second.

The Roadrunner supercomputer costs $133 million and will be used to study nuclear weapons.

The new machine is more than twice as fast as the previous fastest supercomputer, the I.B.M. BlueGene/L, which is based at Lawrence Livermore National Laboratory in California.

The new $133 million supercomputer, called Roadrunner in a reference to the state bird of New Mexico, was devised and built by engineers and scientists at I.B.M. and Los Alamos National Laboratory, based in Los Alamos, N.M. It will be used principally to solve classified military problems to ensure that the nation’s stockpile of nuclear weapons will continue to work correctly as they age. The Roadrunner will simulate the behavior of the weapons in the first fraction of a second during an explosion.

Before it is placed in a classified environment, it will also be used to explore scientific problems like climate change. The greater speed of the Roadrunner will make it possible for scientists to test global climate models with higher accuracy.

To put the performance of the machine in perspective, Thomas P. D’Agostino, the administrator of the National Nuclear Security Administration, said that if all six billion people on earth used hand calculators and performed calculations 24 hours a day and seven days a week, it would take them 46 years to do what the Roadrunner can in one day.

The machine is an unusual blend of chips used in consumer products and advanced parallel computing technologies. The lessons that computer scientists learn by making it calculate even faster are seen as essential to the future of both personal and mobile consumer computing.

The high-performance computing goal, known as a petaflop — one thousand trillion calculations per second — has long been viewed as a crucial milestone by military, technical and scientific organizations in the United States, as well as a growing group including Japan, China and the European Union. All view supercomputing technology as a symbol of national economic competitiveness.

By running programs that find a solution in hours or even less time — compared with as long as three months on older generations of computers — petaflop machines like Roadrunner have the potential to fundamentally alter science and engineering, supercomputer experts say. Researchers can ask questions and receive answers virtually interactively and can perform experiments that would previously have been impractical.

“This is equivalent to the four-minute mile of supercomputing,” said Jack Dongarra, a computer scientist at the University of Tennessee who for several decades has tracked the performance of the fastest computers.

Each new supercomputing generation has brought scientists a step closer to faithfully simulating physical reality. It has also produced software and hardware technologies that have rapidly spilled out into the rest of the computer industry for consumer and business products.

Technology is flowing in the opposite direction as well.
Consumer-oriented computing began dominating research and development spending on technology shortly after the cold war ended in the late 1980s, and that trend is evident in the design of the world’s fastest computers.

The Roadrunner is based on a radical design that includes 12,960 chips that are an improved version of an I.B.M. Cell microprocessor, a parallel processing chip originally created for Sony’s PlayStation 3 video-game machine. The Sony chips are used as accelerators, or turbochargers, for portions of calculations. The Roadrunner also includes a smaller number of more conventional Opteron processors, made by Advanced Micro Devices, which are already widely used in corporate servers.

“Roadrunner tells us about what will happen in the next decade,” said Horst Simon, associate laboratory director for computer science at the Lawrence Berkeley National Laboratory. “Technology is coming from the consumer electronics market and the innovation is happening first in terms of cellphones and embedded electronics.”

The innovations flowing from this generation of high-speed computers will most likely result from the way computer scientists manage the complexity of the system’s hardware. Roadrunner, which consumes roughly three megawatts of power, or about the power required by a large suburban shopping center, requires three separate programming tools because it has three types of processors. Programmers have to figure out how to keep all of the 116,640 processor cores in the machine occupied simultaneously in order for it to run effectively.

“We’ve proved some skeptics wrong,” said Michael R. Anastasio, a physicist who is director of the Los Alamos National Laboratory. “This gives us a window into a whole new way of computing. We can look at phenomena we have never seen before.”

Solving that programming problem is important because in just a few years personal computers will have microprocessor chips with dozens or even hundreds of processor cores.
The industry is now hunting for new techniques for making use of the new computing power.

Some experts, however, are skeptical that the most powerful supercomputers will provide useful examples. “If Chevy wins the Daytona 500, they try to convince you the Chevy Malibu you’re driving will benefit from this,” said Steve Wallach, a supercomputer designer who is chief scientist of Convey Computer, a start-up firm based in Richardson, Tex. Those who work with weapons might not have much to offer the video gamers of the world, he suggested.

Many executives and scientists see Roadrunner as an example of the resurgence of the United States in supercomputing. Although American companies had dominated the field since its inception in the 1960s, in 2002 the Japanese Earth Simulator briefly claimed the title of the world’s fastest by executing more than 35 trillion mathematical calculations per second. Two years later, a supercomputer created by I.B.M. reclaimed the speed record for the United States. The Japanese challenge, however, led Congress and the Bush administration to reinvest in high-performance computing.

“It’s a sign that we are maintaining our position,” said Peter J. Ungaro, chief executive of Cray, a maker of supercomputers. He noted, however, that “the real competitiveness is based on the discoveries that are based on the machines.”

Having surpassed the petaflop barrier, I.B.M. is already looking toward the next generation of supercomputing. “You do these record-setting things because you know that in the end we will push on to the next generation and the one who is there first will be the leader,” said Nicholas M. Donofrio, an I.B.M. executive vice president.

By breaking the petaflop barrier sooner than had been generally expected, the United States’ supercomputer industry has been able to sustain a pace of continuous performance increases, improving a thousandfold in processing power in 11 years.
The next thousandfold goal is the exaflop, which is a quintillion calculations per second, followed by the zettaflop, the yottaflop and the xeraflop.

http://www.nytimes.com/2008/06/09/te...aflops.html?hp

other articles:
http://www.engadget.com/2008/06/09/w...eaks-petaflop/
http://www.theregister.co.uk/2008/06...omputer_debut/
http://www.itproportal.com/articles/...aflop-barrier/
http://www.theinquirer.net/gb/inquir...litary-achieve
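
The headline figures in the article can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check in Python: the constants are the ones the article quotes, while the function names and the 365.25-day year are assumptions made here for illustration.

```python
import math

# Constants as quoted in the article.
ROADRUNNER_FLOPS = 1.026e15   # "1.026 quadrillion calculations per second"
CELL_CHIPS = 12_960           # "12,960 chips"
TOTAL_CORES = 116_640         # "116,640 processor cores"
POWER_WATTS = 3e6             # "roughly three megawatts"

# Assumptions made for this sketch (not from the article).
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25


def implied_calculator_rate(people=6e9, years=46):
    """Calculations per person per second implied by the hand-calculator comparison."""
    roadrunner_ops_in_one_day = ROADRUNNER_FLOPS * SECONDS_PER_DAY
    person_seconds = people * years * DAYS_PER_YEAR * SECONDS_PER_DAY
    return roadrunner_ops_in_one_day / person_seconds


def cores_per_chip():
    """Quoted core count divided by the quoted Cell chip count."""
    return TOTAL_CORES / CELL_CHIPS


def doubling_time_years(factor=1000, years=11):
    """Doubling time implied by a `factor`-fold improvement over `years` years."""
    return years / math.log2(factor)


def flops_per_watt():
    """Sustained calculations per second per watt, from the two quoted figures."""
    return ROADRUNNER_FLOPS / POWER_WATTS


if __name__ == "__main__":
    print(f"calculations per person per second: {implied_calculator_rate():.1f}")
    print(f"cores per Cell chip: {cores_per_chip():.0f}")
    print(f"doubling time: {doubling_time_years():.2f} years")
    print(f"megaflops per watt: {flops_per_watt() / 1e6:.0f}")
```

Taken at face value, the comparison implies roughly 10 calculations per person per second, the core count works out to exactly 9 per Cell chip (consistent with one PPE plus eight SPEs, which suggests the 116,640 figure counts only the Cell cores), a thousandfold gain in 11 years means doubling about every 1.1 years, and the quoted power draw gives about 342 megaflops per watt.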
#2
"AirRaid" wrote in message ...
> Yes, this is cross-posted, because IMO it deserves to be. This is a milestone in computing performance, thus, IMO, it should be as widely posted as possible. This isn't spam whatsoever, so if you want to get ****ed that it's a cross post, okay, just don't think it's spam.
>
> I posted about the Roadrunner supercomputer back in September 2006 when it was announced. Now it's up and running.

Nothing special about it tbh. If we had moved into DNA, chemical, or optical processors to manage such a task then I would say it is special. The fact that they have managed to bolt different types of processors onto a board (think GPU, CPU, and MMPU) and then link the boards together (think NIC) really doesn't show any innovation.

I know my example isn't the same as what we have going on in these servers, but in essence, anyone who wants to throw 400m into a project can see them double what this kit is doing, so it's nothing but a monetary step.
#4
On Jun 10, 5:34 am, "mr deo" wrote:
> Nothing special about it tbh. If we had moved into DNA, chemical, or optical processors to manage such a task then I would say it is special. The fact that they have managed to bolt different types of processors onto a board (think GPU, CPU, and MMPU) and then link the boards together (think NIC) really doesn't show any innovation.
>
> I know my example isn't the same as what we have going on in these servers, but in essence, anyone who wants to throw 400m into a project can see them double what this kit is doing, so it's nothing but a monetary step.

It's hard to know what your perspective could be. The general question is: what could you do if you coupled a high-end out-of-order general-purpose CPU with a beefy compute engine and a serious interconnect? Cell (or similar compute-intensive hardware) has important implications for power consumption.

Balancing the memory, the bandwidth of all the interconnects, the switches, and the compute density of each node (how many processors? connected how?) is more than just buying a bunch of Macs and hooking them together with one-gigabit Ethernet.

I'd like to see a serious discussion of this machine, but this thread with all these cross-posts isn't the right place to be doing it. The one petaflop crap is pure national labs PR. The interesting stuff is elsewhere. I'm sorry if you can't see that.

Robert.
#5
On Jun 10, 7:19 pm, Robert Myers wrote:
> On Jun 10, 5:34 am, "mr deo" wrote:
> > Nothing special about it tbh. If we had moved into DNA, chemical, or optical processors to manage such a task then I would say it is special. The fact that they have managed to bolt different types of processors onto a board (think GPU, CPU, and MMPU) and then link the boards together (think NIC) really doesn't show any innovation.
> >
> > I know my example isn't the same as what we have going on in these servers, but in essence, anyone who wants to throw 400m into a project can see them double what this kit is doing, so it's nothing but a monetary step.
>
> It's hard to know what your perspective could be. The general question is: what could you do if you coupled a high-end out-of-order general-purpose CPU with a beefy compute engine and a serious interconnect? Cell (or similar compute-intensive hardware) has important implications for power consumption.
>
> Balancing the memory, the bandwidth of all the interconnects, the switches, and the compute density of each node (how many processors? connected how?) is more than just buying a bunch of Macs and hooking them together with one-gigabit Ethernet.
>
> I'd like to see a serious discussion of this machine, but this thread with all these cross-posts isn't the right place to be doing it. The one petaflop crap is pure national labs PR. The interesting stuff is elsewhere. I'm sorry if you can't see that.
>
> Robert.

I'd agree with mr deo.

Roadrunner's interconnects look like a big step backward from other PR-heavy American supercomputers. Hopefully, they are not as boring as in the Virginia Tech cluster that you mentioned, but they are not in the same class as Cray/Sandia Red Storm, SGI/NASA Columbia or IBM/LLNL BlueGene/L.

Also, it looks like the programming model for Roadrunner (3 separate architectures and many programmer-visible levels of memory hierarchy) is much harder to use effectively than that of just about any big supercomputer built to date.

Last, but not least, performance per watt and per cubic meter seem seriously worse than BlueGene.
#6
On a positive note, successful testing of the Roadrunner means that IBM has the ability to manufacture a new variant of Cell with a fully-pipelined double-precision FPU in production quantity. The IBM web site indicates that the new engine is available to mere mortals:
http://www-03.ibm.com/systems/bladec...s22/index.html
#7
On Jun 11, 4:32 am, wrote:
> On a positive note, successful testing of the Roadrunner means that IBM has the ability to manufacture a new variant of Cell with a fully-pipelined double-precision FPU in production quantity. The IBM web site indicates that the new engine is available to mere mortals: http://www-03.ibm.com/systems/bladec...ers/qs22/index...

The interesting thing about the IBM PowerXCell 8i processor is that it offers 4 to 5 (IBM says 5) times the double-precision FP performance of the original Cell processor.

Depending on various factors such as having 7 or 8 SPEs active, counting the PPE or not counting it, and clockspeed, the original CELL could manage 218 to 256 to just under 300 GFLOPS of single-precision FP. When double precision is needed, performance drops massively, down to around 25 GFLOPS. The IBM PowerXCell 8i is said to be capable of over 100 GFLOPS double precision. That's a huge increase without adding more SPEs or upping clockspeed.

PowerXCell 8i cannot be considered a next-generation CELL, only an enhanced first-gen CELL. IBM plans to put 32 SPEs on the next-gen CELL to hit 1 TFLOP (single precision, I would imagine) in a single chip by 2010. There was also an official roadmap that showed a CELL with 64 SPEs on a process smaller than 45nm (be it 32nm, 22nm, I don't know). I posted about both in the past.

It's clear that the IBM-Toshiba-Sony CELL is proving to be much more useful beyond PS3 than the Sony-Toshiba 'Emotion Engine' ever was, which really had no use outside of PS2 and cheap, home-made university "supercomputers" such as the one using 60 or 70 PS2s at UIC in IL.

Roadrunner is serious stuff, and it's only the beginning. In the next decade we'll see more powerful supercomputers using next-gen CELLs.
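
The per-chip numbers in this post can be scaled up to the whole machine as a rough consistency check. This is a minimal sketch that takes the post's figures at face value: "over 100 GFLOPS" is treated as a lower bound, the 12,960 chip count comes from the article earlier in the thread, the Opterons are ignored, and the function names are made up for illustration.

```python
# Figures as stated in the post (approximate, taken at face value).
ORIGINAL_CELL_DP_GFLOPS = 25.0    # "around 25 GFLOPS" double precision
POWERXCELL_DP_GFLOPS = 100.0      # "over 100 GFLOPS", used here as a lower bound
CELL_CHIPS = 12_960               # chip count from the article earlier in the thread

NEXT_GEN_TARGET_GFLOPS = 1000.0   # "1 TFLOP ... in a single chip"
NEXT_GEN_SPES = 32                # "32 SPEs on the next-gen CELL"


def dp_speedup():
    """Double-precision gain of PowerXCell 8i over the original Cell."""
    return POWERXCELL_DP_GFLOPS / ORIGINAL_CELL_DP_GFLOPS


def aggregate_dp_pflops():
    """Lower bound on the combined double-precision rate of all Cell chips, in PFLOPS."""
    return CELL_CHIPS * POWERXCELL_DP_GFLOPS / 1e6


def required_gflops_per_spe():
    """Per-SPE rate a 32-SPE chip would need in order to reach 1 TFLOP."""
    return NEXT_GEN_TARGET_GFLOPS / NEXT_GEN_SPES


if __name__ == "__main__":
    print(f"DP speedup: {dp_speedup():.0f}x")
    print(f"aggregate Cell DP rate: at least {aggregate_dp_pflops():.3f} PFLOPS")
    print(f"needed per SPE for 1 TFLOP: {required_gflops_per_spe():.2f} GFLOPS")
```

With these inputs the speedup comes out at 4x (the low end of the "4 to 5 times" range), the Cell chips alone add up to at least about 1.3 PFLOPS of double-precision throughput, which leaves room above the 1.026 PFLOPS reported, and a 32-SPE part would need a little over 31 GFLOPS per SPE to reach 1 TFLOP.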
#8
On Jun 11, 7:32 am, Air Raid wrote:
> On Jun 11, 4:32 am, wrote:
> > On a positive note, successful testing of the Roadrunner means that IBM has the ability to manufacture a new variant of Cell with a fully-pipelined double-precision FPU in production quantity. The IBM web site indicates that the new engine is available to mere mortals: http://www-03.ibm.com/systems/bladec...ers/qs22/index...
>
> The interesting thing about the IBM PowerXCell 8i processor is that it offers 4 to 5 (IBM says 5) times the double-precision FP performance of the original Cell processor.
>
> Depending on various factors such as having 7 or 8 SPEs active, counting the PPE or not counting it, and clockspeed, the original CELL could manage 218 to 256 to just under 300 GFLOPS of single-precision FP. When double precision is needed, performance drops massively, down to around 25 GFLOPS. The IBM PowerXCell 8i is said to be capable of over 100 GFLOPS double precision. That's a huge increase without adding more SPEs or upping clockspeed.
>
> PowerXCell 8i cannot be considered a next-generation CELL, only an enhanced first-gen CELL. IBM plans to put 32 SPEs on the next-gen CELL to hit 1 TFLOP (single precision, I would imagine) in a single chip by 2010. There was also an official roadmap that showed a CELL with 64 SPEs on a process smaller than 45nm (be it 32nm, 22nm, I don't know). I posted about both in the past.
>
> It's clear that the IBM-Toshiba-Sony CELL is proving to be much more useful beyond PS3 than the Sony-Toshiba 'Emotion Engine' ever was, which really had no use outside of PS2 and cheap, home-made university "supercomputers" such as the one using 60 or 70 PS2s at UIC in IL.
>
> Roadrunner is serious stuff, and it's only the beginning. In the next decade we'll see more powerful supercomputers using next-gen CELLs.

Though the CELL may prove more useful, it still has some serious arch issues that a lot of people don't like. Programming the thing is not the easiest thing in the world to do (though parallel programming models for CMPs are somewhat of an open issue).

It's nice that they will continue to push performance, but many GPUs will be well above 1 TFLOPS SPFP by 2010... which means that obviously Larrabee will be out then.

While I'm glad that CELL came out, as it's enlightened the world and solved several of the multicore integration problems, it simply doesn't seem like the chip of the future right now. I simply haven't heard a whole lot of interest from the HPC community in CELL, but you never know... I could be wrong.
#9
On Jun 11, 5:22 am, wrote:
> I'd agree with mr deo.
> Roadrunner's interconnects look like a big step backward from other PR-heavy American supercomputers.

http://www.lanl.gov/orgs/hpc/roadrun...8_RR_model.pdf

The predicted worst-case latency is about the same as Blue Gene. Red Storm routing/switching looks like Blue Gene. Columbia uses both InfiniBand and NUMAlink in a fat tree like Roadrunner.

> Hopefully, they are not as boring as in the Virginia Tech cluster that you mentioned, but they are not in the same class as Cray/Sandia Red Storm, SGI/NASA Columbia or IBM/LLNL BlueGene/L.

Could you be more specific than, "Hopefully they are not as boring?"

Blue Gene and Red Storm have an interconnect topology that's fine for problems with good locality. Not so good for problems requiring global communication. Columbia and Roadrunner use fat trees, much better for the problems that interest me the most.

> Also, it looks like the programming model for Roadrunner (3 separate architectures and many programmer-visible levels of memory hierarchy) is much harder to use effectively than that of just about any big supercomputer built to date.

Coprocessors are here to stay. At the moment, they will be painfully visible to the programmer.

> Last, but not least, performance per watt and per cubic meter seem seriously worse than BlueGene.

Not important for the time being. FP-intensive coprocessors can be very effective in flops/watt. Neither the OOO processor nor the FP coprocessor was optimized for power performance.

The more important point, for the moment, is to show that all those flops can actually be put to use. To get back to the previous poster's point, building a machine with lots of flops requires only money. Actually using those flops for something other than Linpack is another matter.

Robert.
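
The locality-versus-global-communication point can be illustrated by counting worst-case hops in the two kinds of network. This is a minimal sketch under stated assumptions: an idealized torus with wraparound links, an idealized fat tree in which every switch has the same number of down-ports, and machine sizes and port counts that are hypothetical rather than the real Blue Gene, Red Storm, Columbia or Roadrunner configurations. Hop count is only a proxy; it says nothing about link bandwidth or contention.

```python
def torus_diameter(dims):
    """Worst-case number of links between two nodes in a torus.

    With wraparound links, the farthest node along a ring of k nodes
    is k // 2 links away, and the dimensions add up.
    """
    return sum(k // 2 for k in dims)


def fat_tree_levels(nodes, down_ports):
    """Switch levels needed so that one tree reaches `nodes` endpoints.

    Assumes every switch fans out to `down_ports` children (hypothetical).
    """
    levels, reach = 1, down_ports
    while reach < nodes:
        reach *= down_ports
        levels += 1
    return levels


def fat_tree_worst_case_links(levels):
    """Links on the longest path: up through every level, then back down."""
    return 2 * levels


if __name__ == "__main__":
    dims = (32, 32, 64)                 # hypothetical 3D torus, 65,536 nodes
    nodes = dims[0] * dims[1] * dims[2]
    levels = fat_tree_levels(nodes, down_ports=16)
    print(f"torus worst case: {torus_diameter(dims)} links")
    print(f"fat tree worst case: {fat_tree_worst_case_links(levels)} links "
          f"({levels} switch levels)")
```

For the same 65,536 endpoints the torus worst case is 64 links, growing with the side length of the machine, while the fat tree worst case is 8 links, growing only with the logarithm of the node count. Nearest-neighbour traffic on the torus is a single link either way, which is why the torus suits problems with good locality.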
#10
In comp.sys.super AirRaid wrote:
> Yes, this is cross-posted, because IMO it deserves to be. This is a milestone in computing performance, thus, IMO, it should be as widely posted as possible. This isn't spam whatsoever, so if you want to get ****ed that it's a cross post, okay, just don't think it's spam.
>
> I posted about the Roadrunner supercomputer back in September 2006 when it was announced. Now it's up and running.
>
> http://graphics8.nytimes.com/images/...op.enlarge.jpg
>
> Military Supercomputer Sets Record
> By JOHN MARKOFF
> Published: June 9, 2008
>
> SAN FRANCISCO — An American military supercomputer, assembled from components originally designed for video game machines, has reached a long-sought-after computing milestone by processing more than 1.026 quadrillion calculations per second.
>
> The Roadrunner supercomputer costs $133 million and will be used to study nuclear weapons.
>
> The new machine is more than twice as fast as the previous fastest supercomputer, the I.B.M. BlueGene/L, which is based at Lawrence Livermore National Laboratory in California.
>
> The new $133 million supercomputer, called Roadrunner in a reference to the state bird of New Mexico, was devised and built by engineers and scientists at I.B.M. and Los Alamos National Laboratory, based in Los Alamos, N.M. It will be used principally to solve classified military problems to ensure that the nation’s stockpile of nuclear weapons will continue to work correctly as they age. The Roadrunner will simulate the behavior of the weapons in the first fraction of a second during an explosion.

I wonder what they really do with these computers. I find it unlikely they still don't know how nuclear weapons work, especially considering they're mature technology and they've been around for decades.