ZDNet UK


IBM releases supercomputer details

Stephen Shankland, CNET News.com

Published: 08 May 2003 12:03 BST

IBM is shedding light on a programme to create the world's fastest supercomputer, illuminating a dual-pronged strategy, an unusual new processor design and a proclivity for the Linux operating system.

"Blue Gene" is an ambitious project to expand the horizons of supercomputing, with the ultimate goal of creating a system that can perform one quadrillion calculations per second, or one petaflop. IBM expects a machine it calls Blue Gene/P to be the first to achieve the computational milestone. Today's fastest machine, NEC's Earth Simulator, is comparatively slow -- about one-thirtieth of a petaflop -- but fast enough to worry the United States government that the country is losing its computing lead to Japan.

"Blue Gene is a completely odd-ball, you've-never-seen-anything-like-this-before design," said Illuminata analyst Jonathan Eunice. "It is not custom everything, (but) it is still very exotic compared to anything you can buy."

IBM has begun building the chips that will be used in the first Blue Gene, a machine dubbed Blue Gene/L that will run Linux and have more than 65,000 computing nodes, said Bill Pulleyblank, director of IBM's Deep Computing Institute and the executive overseeing the project. Each node has a small chip with an unusually large number of functions crammed onto the single slice of silicon: two processors, four accompanying mathematical engines, 4MB of memory and communication systems for five separate networks.

Joining Blue Gene/L is a second major experimental system called "Cyclops," which in comparison will have many more processors etched onto each slice of silicon -- perhaps as many as 64, Pulleyblank said.

In addition, IBM probably will use the Linux operating system on all the members of the Blue Gene family, not just Blue Gene/L. "My belief is that's definitely where we're going to go," Pulleyblank said.

Blue Gene's original mission was to tackle the computationally onerous task of using the laws of physics to predict how chains of biochemical building blocks described by DNA fold into proteins -- massive molecules such as haemoglobin. IBM has expanded its mission, though, to other subjects including global climate simulation and financial risk analysis.

"We're looking at a broad suite of applications," Pulleyblank said, a move that will help IBM reach one of the goals of the Blue Gene project: to produce technology that customers ultimately will pay for.

IBM already has spent more than the original $100m budgeted for the project and won't meet its 2004 goal for the ultimate machine, but the company has made progress bringing its ideas to fruition.

IBM is building the processors for the first member of the Blue Gene family, Blue Gene/L, and expects to use them this year in a machine that will be a microcosm of the eventual full-fledged Blue Gene/L due by the end of 2004, Pulleyblank said. IBM also has begun designing the processors for Cyclops, which IBM internally calls Blue Gene/C.

The performance results of Blue Gene/L and Cyclops will determine the design IBM chooses for the eventual petaflop machine, Blue Gene/P, Pulleyblank said.

"The only thing that's sure is it will be an...architecture that will have massive amounts of parallelism in it. It will be a very power-efficient, space-efficient design," Pulleyblank said. How IBM reaches its petaflop-and-beyond goal is "going to depend in large part on what we find out when we start running on Blue Gene/L."

There are differences from what IBM originally envisioned. For one thing, the processors will be based on IBM's PowerPC 440GX processor instead of being designed from scratch. The machine will be cooled by air instead of water and will use a different network. It will also have less memory, though still a whopping 16 terabytes in total.

Blue Gene/L will be large, but significantly smaller than current IBM supercomputers such as ASCI White, a nuclear weapons simulation machine at Lawrence Livermore National Laboratory, which will also be the home of Blue Gene/L. ASCI White takes up the area of two basketball courts, or 9,400 square feet, while Blue Gene/L should fit into half a tennis court, or about 1,400 square feet.

IBM's Blue Gene research has an academic flavour, but the company's ultimate goal is profit. IBM is second only to Hewlett-Packard in the $4.7bn market for high-performance technical computing machines. From 2001 to 2002, IBM's sales grew 28 percent from $1.04bn to $1.33bn, while HP's shrank 25 percent from $2.1bn to $1.58bn, according to research firm IDC.

Much as sponsoring a winning race car does for an automaker, building cutting-edge computers can bring bragging rights that help attract top engineers and convince customers that a company has sound long-term plans.

The design of Blue Gene/L
Blue Gene/L is an exercise in powers of two, starting with each of the 65,536 compute nodes. Each of the dual processors on the compute node has two "floating point units," engines for performing mathematical calculations.

Each node's chip is 121 square millimetres and built on a manufacturing process with 130-nanometre features, Pulleyblank said. That compares with 267 square millimetres for IBM's current flagship processor, the Power4+ used in its top-end Unix servers. The small size for Blue Gene's chips is crucial to ensure the chips don't emit too much waste heat, which would prevent engineers from packing them densely enough.

Two nodes are mounted onto a module; 16 modules fit into a chassis; and 32 chassis are mounted into a rack. A total of 64 racks will be installed at the Livermore lab by the end of 2004, with the first 512-node half-rack prototype to be built this autumn at IBM's Thomas J. Watson Research Center.
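The packaging figures above multiply out to the node count quoted for the full machine. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, and only the numbers come from the article.

```python
# Packaging figures as quoted in the article; the names are illustrative.
NODES_PER_MODULE = 2
MODULES_PER_CHASSIS = 16
CHASSIS_PER_RACK = 32
RACKS = 64


def nodes_per_rack() -> int:
    """Compute nodes in one fully populated rack."""
    return NODES_PER_MODULE * MODULES_PER_CHASSIS * CHASSIS_PER_RACK


def total_nodes() -> int:
    """Compute nodes across all racks planned for the Livermore lab."""
    return nodes_per_rack() * RACKS


if __name__ == "__main__":
    print("nodes per rack:     ", nodes_per_rack())
    print("nodes per half-rack:", nodes_per_rack() // 2)
    print("total nodes:        ", total_nodes())
    print("is 2**16:           ", total_nodes() == 2 ** 16)
```

One rack holds 1,024 nodes, so the 512-node prototype is exactly half a rack, and 64 racks give 65,536 nodes, or two to the power of 16.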

"We're going to have first hardware this year. We are actually fabricating chips for this machine," Pulleyblank said.

All nodes are created equal, but 1,024 of them will have a more important task than the rest, Pulleyblank said. These so-called input-output, or I/O, nodes will run an instance of Linux and assign calculations to a stable of 64 processor nodes.

These underling nodes won't run Linux, but instead a custom operating system stripped to its bare essentials, he said. When they have to perform a task they're not equipped to handle, they can pass the job up the pecking order to one of the I/O nodes.

"It will look like it has 1,024 I/O nodes, each of which manages a gang of 64 compute nodes," Pulleyblank said.
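The grouping Pulleyblank describes can be checked with a short sketch. The article gives only the two counts, so the mapping function below is a hypothetical illustration that assumes compute nodes are numbered from zero and assigned to I/O nodes in contiguous blocks of 64. It is not IBM's actual scheme.

```python
# Counts quoted in the article.
IO_NODES = 1024
COMPUTE_NODES_PER_IO_NODE = 64


def compute_nodes_managed() -> int:
    """Total compute nodes covered by all I/O nodes."""
    return IO_NODES * COMPUTE_NODES_PER_IO_NODE


def io_node_for(compute_node: int) -> int:
    """Hypothetical mapping: contiguous blocks of 64 compute nodes per I/O node."""
    if not 0 <= compute_node < compute_nodes_managed():
        raise ValueError("compute node index out of range")
    return compute_node // COMPUTE_NODES_PER_IO_NODE


if __name__ == "__main__":
    print("compute nodes managed:", compute_nodes_managed())
    print("I/O node for compute node 64:", io_node_for(64))
```

Under these assumptions, 1,024 gangs of 64 account for 65,536 compute nodes, which matches the compute node count given for the full machine.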

Running Linux, a move made possible by using the comparatively ordinary 440GX processor, was crucial to make the system useful. "It was absolutely clear by making it run Linux, we were opening it up to a broad range of applications we couldn't get otherwise," Pulleyblank said.

Of the two processors on each node, one will be devoted to number-crunching and the other to communicating with the rest of the system. In this configuration, the system should be able to perform at a rate of 180 teraflops, or 180 trillion calculations per second. In some cases where minimal communication between nodes is required, both processors of each node can concentrate on maths, bringing the system performance to 360 teraflops, Pulleyblank said.
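To put the two performance figures in context, the sketch below divides them across the 65,536 compute nodes and compares them with the petaflop target. It assumes the quoted figures are peak rates spread evenly over all nodes; the names are illustrative.

```python
# Figures quoted in the article; one teraflop is one trillion calculations per second.
COMPUTE_NODES = 65_536
ONE_PROCESSOR_TFLOPS = 180   # one processor per node doing the maths
TWO_PROCESSOR_TFLOPS = 360   # both processors per node doing the maths
PETAFLOP_IN_TFLOPS = 1000


def per_node_gflops(system_tflops: float, nodes: int = COMPUTE_NODES) -> float:
    """Average gigaflops per node, assuming the load is spread evenly."""
    return system_tflops * 1000 / nodes


def fraction_of_petaflop(system_tflops: float) -> float:
    """System performance as a fraction of one petaflop."""
    return system_tflops / PETAFLOP_IN_TFLOPS


if __name__ == "__main__":
    for tflops in (ONE_PROCESSOR_TFLOPS, TWO_PROCESSOR_TFLOPS):
        print(f"{tflops} teraflops: "
              f"{per_node_gflops(tflops):.2f} gigaflops per node, "
              f"{fraction_of_petaflop(tflops):.0%} of a petaflop")
```

On these numbers each node contributes roughly 2.7 gigaflops with one processor computing and about 5.5 with both, and even the faster mode reaches only about a third of the petaflop goal set for Blue Gene/P.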

Communication among the nodes is a challenge IBM tackled by employing two primary networks. The first network is a mesh through which every node can reach every other one, with a message travelling from one node to another having to hop across at most 64 nodes in between. The second network is a branching tree structure that can quickly deliver messages to the entire collection of nodes or gather information from them.

When a message needs to be sent, "we automatically decide the better way to route it," Pulleyblank said. "Also interesting is that if one network fails, we can still completely run with the other network, but slower."

In addition, a third network uses conventional 1-gigabit-per-second Ethernet technology. There are also two management networks, one to help boot nodes and one to monitor and control them.
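The 64-hop limit quoted for the mesh network is consistent with one plausible layout. The sketch below assumes a three-dimensional torus of 64 x 32 x 32 nodes with wraparound links in each dimension. The article does not give these dimensions, so treat them as an assumption, and note that the sketch counts link-to-link hops.

```python
# ASSUMPTION: a 64 x 32 x 32 torus; the article states only the 64-hop maximum.
ASSUMED_DIMS = (64, 32, 32)


def hops(a, b, dims=ASSUMED_DIMS) -> int:
    """Shortest hop count between coordinates a and b in a torus."""
    total = 0
    for x, y, size in zip(a, b, dims):
        direct = abs(x - y)
        total += min(direct, size - direct)  # wraparound link may be shorter
    return total


def max_hops(dims=ASSUMED_DIMS) -> int:
    """Worst-case hop count: half-way round each dimension."""
    return sum(size // 2 for size in dims)


def node_count(dims=ASSUMED_DIMS) -> int:
    """Number of nodes in the torus."""
    count = 1
    for size in dims:
        count *= size
    return count


if __name__ == "__main__":
    print("nodes:", node_count())
    print("worst-case hops:", max_hops())
    print("hops to far corner via wraparound:", hops((0, 0, 0), (63, 31, 31)))
```

Under these assumed dimensions the torus holds 65,536 nodes and the farthest pair is 32 + 16 + 16 = 64 hops apart, which agrees with the maximum cited for the mesh.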

Blue Gene has some unusual features, but IBM has tried as much as possible to anchor the system to more mainstream technology. Staying on the beaten path is the best way to take advantage of technology that's improving fastest, Pulleyblank said, and it also makes it easier to create products out of the Blue Gene research.

"Our direction has been as much as possible to exploit these standard components," he said.

