Chipmakers aim to unclog data paths
Published: 20 Aug 2007 11:01 BST
If you are going to build processors with large numbers of cores, argues Anant Agarwal, you have to figure out how to connect them to each other, too.
A decade of research into that problem has resulted in Agarwal's company, Tilera. The company has invented a 64-core processor with an embedded high-speed network that can pass up to 32 terabits of data a second between the various cores.
The company's Tile64 — designed for networking equipment and video-streaming servers — can provide 10 times the performance of an Intel Xeon chip while consuming far less power, or 40 times the performance of a digital signal processor from Texas Instruments, the company says.
And 64 cores is just the start.
Agarwal and other executives from the company will discuss the architecture further on Monday at the Hot Chips conference at Stanford University in Palo Alto, California. Researchers at Intel, IBM, AMD and the University of Texas, among others, will also present papers.
Promising chip companies with strong technical backgrounds rise and fade out on a regular basis in the semiconductor industry. Still, Tilera is trying to tackle one of the thorniest, and thus potentially one of the most lucrative, problems for computer designers today: slow, clogged data paths. Processor speeds and transistor counts have climbed at a rapid, steady pace for decades, but the buses and interconnects that carry data to and from processors get upgraded at a much slower rate.
HyperTransport, found inside processors from AMD, has probably been the most significant achievement in this regard in the last decade. HyperTransport accounted for a substantial percentage of the performance gains AMD achieved with the Athlon chip.
"The fundamental limitation of CPUs is no longer [core] performance but I/O [input/output]," Andy Bechtolsheim said in a presentation to reporters in June on Sun's efforts in supercomputing. "You don't get more I/O just because you shrink the manufacturing process."
Sun has been working on a technology called proximity communication that allows different chips to talk to each other without wires by virtue of just being close. It's not ready yet.
Last September, Intel's Justin Rattner unveiled Intel's proposed answer: an 80-core chip in which the cores are linked through an embedded network.
The Intel chip is conceptually similar to the Tile64, Agarwal said. Intel, though, has given itself five years to come out with 80-core chips.
Tilera has already delivered samples to customers and will start shipping chips commercially in the fourth quarter. It has 12 customers including networking gear manufacturers 3Com and TopLayer.
Intel's 80-core chip, however, also contains through-silicon vias, which unclog the processor-to-memory pathways. The Tile64 uses conventional memory controllers.
Under the hood
Tilera's chips consist of small, individual building blocks, or tiles. Each tile sports a RISC processing core that runs at 600MHz to 1GHz as well as a switch that can send data in four directions: up, down, right and left. These switches form a mesh network, called iMesh, that lets the tiles communicate.
The mesh network itself is also divided up into five layers, depending on the type of transaction. One layer handles cache-to-cache transfers, while another handles streaming data.
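To give a feel for how data might hop from tile to tile across such a mesh, here is a minimal Python sketch. It rests on assumptions the article does not confirm: the 64 tiles are modelled as an 8x8 grid, and packets use X-then-Y (dimension-ordered) routing, a common scheme for mesh networks. The names `GRID`, `route` and `average_hops` are hypothetical and are not part of any Tilera software.

```python
# Hypothetical model: 64 tiles laid out as an 8x8 grid (the layout is an assumption).
GRID = 8


def route(src, dst):
    """Return the list of directions a packet takes from tile src to tile dst.

    Uses X-then-Y (dimension-ordered) routing, assumed here for illustration;
    the article does not say which routing scheme iMesh uses.
    """
    (x, y), (tx, ty) = src, dst
    for coord in (x, y, tx, ty):
        if not 0 <= coord < GRID:
            raise ValueError("tile coordinate outside the grid")
    hops = []
    # Travel along the row first...
    hops += ["right" if tx > x else "left"] * abs(tx - x)
    # ...then along the column.
    hops += ["down" if ty > y else "up"] * abs(ty - y)
    return hops


def average_hops():
    """Mean hop count over all ordered pairs of distinct tiles."""
    tiles = [(x, y) for x in range(GRID) for y in range(GRID)]
    total = sum(len(route(a, b)) for a in tiles for b in tiles if a != b)
    return total / (len(tiles) * (len(tiles) - 1))


if __name__ == "__main__":
    print(route((0, 0), (2, 1)))        # ['right', 'right', 'down']
    print(len(route((0, 0), (7, 7))))   # 14 hops, corner to corner
    print(round(average_hops(), 2))
```

Under these assumptions, no packet travels more than 14 short hops, and no single shared bus has to span the whole chip.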
Each tile contains two caches of memory for rapid data access. Although the caches sit on individual tiles, a tile can access the cache of every other tile, depending on how the chip is programmed.
Individual tiles consume a low 170 milliwatts to 300 milliwatts on average. Cores can also be powered down independently when not in use to cut power consumption.
The size of the chip, and its ultimate performance, depend on how many tiles are included. The first product will contain 64 tiles and a 5MB distributed cache. The company says it will come out with a less expensive 36-tile version next year and then a 120-tile version close to, or in, 2009. Tiles on a single chip can be grouped into virtual processors assigned to different computing tasks.
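A rough back-of-the-envelope calculation shows what the quoted per-tile figures imply for each chip size. The Python sketch below uses only numbers from the article, but it makes two assumptions: that total tile power is simply the tile count multiplied by the per-tile average, ignoring memory controllers and I/O, and that the 5MB cache is split evenly across the 64 tiles. The function names are hypothetical.

```python
# Figures quoted in the article.
TILE_POWER_MW = (170, 300)    # average per-tile draw, low and high, in milliwatts
TILE_COUNTS = (36, 64, 120)   # announced chip variants
CACHE_MB_64 = 5               # distributed cache on the 64-tile part


def tile_power_watts(tiles):
    """Return (low, high) power in watts for the tiles alone.

    Assumption: power scales linearly with tile count; anything outside
    the tiles (memory controllers, I/O) is ignored.
    """
    low, high = TILE_POWER_MW
    return tiles * low / 1000, tiles * high / 1000


def cache_per_tile_kb(total_mb, tiles):
    """Cache per tile in KB, assuming an even split and 1MB = 1024KB."""
    return total_mb * 1024 / tiles


if __name__ == "__main__":
    for n in TILE_COUNTS:
        low, high = tile_power_watts(n)
        print(f"{n} tiles: {low:.2f} W to {high:.2f} W")
    print(f"64-tile part: {cache_per_tile_kb(CACHE_MB_64, 64):.0f} KB of cache per tile")
```

On these assumptions, the tiles of the 64-tile part would draw roughly 11 to 19 watts in total, with about 80KB of cache each.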
Performance gains over conventional chips arise directly out of Tile64's design. A distributed network of slower processors can get jobs done quicker and with less overall energy than two or four larger, faster, more complex cores. Rather than powering a large bus, the chip can rely on shorter connections.
So what needs this sort of computing power? Firewalls, according to Agarwal. The avalanche of spam has created a market for networking devices…




