Tilera to Introduce 64-Core Processor
By Andy Patrizio

An MIT-inspired startup will introduce a new multi-core chip today at the annual Hot Chips conference at Stanford University. The TILE64 boasts a "clean sheet" design, unencumbered by any legacy compatibility concerns, that Tilera says will provide a huge leap in multithreaded performance.

Tilera was founded in 2004 to bring to market the multi-core processor designs of MIT researcher Anant Agarwal. Agarwal created what he called a "mesh" multi-core architecture, where the cores are all interconnected rather than going through a frontside bus, as Intel's multi-core chips do.

Agarwal first created this multi-core architecture in 1996, long before Intel and AMD were anywhere close to doing it. The project received funding from the Defense Advanced Research Projects Agency (DARPA) and the National Science Foundation, the agency that managed the Internet for decades. Tilera holds 40-plus patents for its multi-core design.

TILE64 will be the first in a series of processors built around massively multi-core chips. The TILE64 processor contains 64 full-featured, programmable cores that Tilera claims can perform 500 billion operations per second and delivers ten times the performance and thirty times the performance-per-watt of the Intel dual-core Xeon.

Agarwal said the company can make these performance leaps because it doesn't use any legacy technologies or designs.

"The real problem with scale is existing multi-core architectures use a bus. In that architecture, the bus is a central switch and all the cores are connected to the single central switch. A packet has to go through it no matter what, which is fine for one, two or four cores, but it does not scale," he told internetnews.com.

Tilera uses a mesh architecture, where the cores are laid out in a checkerboard-like grid, all connected through high-speed interconnects. "In architectures of this sort, you can keep growing and you won't have any serious congestion," said Agarwal.

Intel has promised to dispense with the frontside bus with the Nehalem architecture, due late next year. AMD does not have a frontside bus in the Opteron, but it's also using four cores at the most, while Tilera is at 64.

The TILE family can scale up to even more, or down to a two-core design for the smallest of designs, such as a cell phone. Its power consumption is a few hundred milliwatts per core, Agarwal said. Its clock speed will range from 600MHz to 1GHz.

But there's a lot more on the chip than just cores. It has a pair of 10 gigabit Ethernet ports directly on the chip for high speed networking, as well as on-board I/O and peripheral controllers. Its integrated memory controllers allow for up to 200 gigabits of memory bandwidth within the chip.

That's what made the TILE64 chip so appealing to Top Layer, developer of network security and intrusion detection appliances. The company had built its own processors but now plans to switch to Tilera's chips, according to Chief Strategy Officer Mike Paquette.

"Our software is a multi-core design, and we were able to map out functionality almost 1 for 1 for each process to a core in a Tilera chip," he said. "The performance we expect in our estimates exceeds what we could have gotten from any silicon providers."

Top Layer decided to license processors for future products rather than bear the expense of building any more, and no other processors had the scalability.

"Because the movement of data is so much of what we do, we needed a multi-core chip that was optimized for what we were doing rather than something optimized for general purpose computing. Tilera has capabilities for network capabilities that are far ahead of what you can get from [x86] processors," said Paquette.

Tilera will ship a full development toolkit, called the Multicore Development Environment (MDE), for building applications. It's an Eclipse-based Integrated Development Environment (IDE) with an ANSI standard C compiler, an application level library and tools for debugging and profiling multi-core processors.

Wisely, Tilera is not taking on Intel and AMD right out of the gate, as Transmeta did. It's going for the embedded market.

"We're focused on embedded because we are a startup and want to go into a space where there is massive demand for performance like ours. We can focus on a couple of markets and do really well in those markets by addressing customer demands squarely and don't have to go up against a dominant competitor," said Agarwal.

Tilera expects to sell the TILE64 processor for $435 in lots of 10,000 units. The company is also planning a 36-core and 120-core processor for the near future.

http://www.internetnews.com/ent-news/article.php/3695116
Tilera to Introduce 64-Core Processor
Started by ●October 11, 2007
Reply by ●October 11, 2007
AirRaid wrote:
> Tilera to Introduce 64-Core Processor
> By Andy Patrizio

Oh great ... more marketing hyperbole spam from AirRaid.
Reply by ●October 11, 2007
On Oct 12, 5:45 am, "Michael N. Moran" <mnmo...@bellsouth.net> wrote:
> AirRaid wrote:
> > Tilera to Introduce 64-Core Processor
> > By Andy Patrizio
>
> Oh great ... more marketing hyperbole spam from AirRaid.

Why do you say that, Michael? Can you give your opinion?

Jogging
Reply by ●October 11, 2007
joggingsong@gmail.com wrote:
> On Oct 12, 5:45 am, "Michael N. Moran" <mnmo...@bellsouth.net> wrote:
>> AirRaid wrote:
>>> Tilera to Introduce 64-Core Processor
>>> By Andy Patrizio
>> Oh great ... more marketing hyperbole spam from AirRaid.
>
> Why do you say that?Michael. Can you give your opinion?
>
> Jogging
>

He just did....
Reply by ●October 12, 2007
[[followups-to trimmed to only comp.arch]]

In comp.arch AirRaid <AirRaidJet@gmail.com> wrote:
> The TILE64 processor contains 64 full-featured, programmable
> cores that Tilera claims can perform 500 billion operations per second
> and delivers ten times the performance and thirty times the
> performance-per-watt of the Intel dual-core Xeon.
[[...]]
> integrated memory controllers allow for up to 200 gigabits of memory
> bandwidth within the chip.

What about off-chip bandwidth -- can this keep up with 64 cores' cache misses? Is there a clever new door to get through the memory wall?

--
-- "Jonathan Thornburg -- remove -animal to reply" <jthorn@soton.ac-zebra.uk>
School of Mathematics, U of Southampton, England
"Washing one's hands of the conflict between the powerful and the powerless means to side with the powerful, not to be neutral."
-- quote by Freire / poster by Oxfam
Reply by ●October 12, 2007
On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com> wrote:
>Tilera to Introduce 64-Core Processor
>By Andy Patrizio
>
>"The real problem with scale is existing multi-core architectures use
>a bus. In that architecture, the bus is a central switch and all the
>cores are connected to the single central switch. A packet has to go
>through it no matter what, which is fine for one, two or four cores,
>but it does not scale," he told internetnews.com.
>
>Tilera uses a mesh architecture, where the cores are laid out in a
>checkerboard-like grid, all connected through high-speed
>interconnects. "In architectures of this sort, you can keep growing
>and you won't have any serious congestion," said Agarwal.

I'm a bit puzzled by this. If the cores are laid out in a checkerboard like grid, doesn't that mean each core is linked to the 8 cores around it? So it would still come up to some kind of latency bottleneck wouldn't it? What difference is it from AMD's ccHTT links except they've got a few more?

Or does he mean the 64 cores are all directly connected to each other... meaning there are some 63 connections coming out of each core to every other core for some mindboggling number? (I think 63+62+61+... but my abysmal ability with maths fails me here) But essentially becoming a nightmare if the number of cores go out. So unlikely to be case, no?

--
A Lost Angel, fallen from heaven
Lost in dreams, Lost in aspirations,
Lost to the world, Lost to myself
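For what it's worth, the two extremes raised in this post can be counted directly. The sketch below is only a back-of-envelope illustration: it assumes an 8 x 8 grid (the article says only "checkerboard-like") and compares nearest-neighbour wiring, with and without diagonal links, against full all-to-all wiring. The function names are made up for this example and nothing here comes from Tilera.

```python
def all_to_all_links(cores: int) -> int:
    """Point-to-point links if every core is wired to every other core."""
    # 63 + 62 + ... + 1 for 64 cores, i.e. n * (n - 1) / 2
    return cores * (cores - 1) // 2


def mesh_links(rows: int, cols: int, diagonals: bool = False) -> int:
    """Links in a rows x cols grid where only neighbouring cores are wired."""
    links = rows * (cols - 1) + cols * (rows - 1)  # east-west plus north-south
    if diagonals:
        links += 2 * (rows - 1) * (cols - 1)  # two diagonals per grid square
    return links


def max_hops(rows: int, cols: int, diagonals: bool = False) -> int:
    """Worst-case hop count, from one corner to the opposite corner."""
    if diagonals:
        return max(rows, cols) - 1
    return (rows - 1) + (cols - 1)


if __name__ == "__main__":
    print(all_to_all_links(64))               # 2016
    print(mesh_links(8, 8))                   # 112 (4 neighbours per core)
    print(mesh_links(8, 8, diagonals=True))   # 210 (8 neighbours per core)
    print(max_hops(8, 8))                     # 14
    print(max_hops(8, 8, diagonals=True))     # 7
```

So the 63+62+61+... sum comes to 2,016 links for all-to-all wiring, against a hundred or two for a grid. The price of the grid is that a message between distant cores needs several hops, which is where the latency question comes in.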
Reply by ●October 12, 2007
On Fri, 12 Oct 2007 16:16:32 +0000, The little lost angel wrote:
> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it? What difference is it from AMD's ccHTT links except
> they've got a few more?
>

A cpu has more than one layer. I'm not sure how many it has but I think AMD's is about 9 layers with the K8. I'd suspect the tile64 is a lot more. The interconnect would be similar to AMD's HT interconnect bus.

> Or does he mean the 64 cores are all directly connected to each other...
> meaning there are some 63 connections coming out of each core to every
> other core for some mindboggling number? (I think 63+62+61+... but my
> abysmal ability with maths fails me here) But essentially becoming a
> nightmare if the number of cores go out. So unlikely to be case, no?

Probably the reason the core speeds are kept a lot lower. I know I'd like to have one of these on a small MB compatible with an ATX/BTX case, but realistically, I have no need for so much power. But cutting electrical use would be nice.

--
Want the ultimate in free OTA SD/HDTV Recorder? http://mythtv.org
http://mysettopbox.tv/knoppmyth.html Usenet alt.video.ptv.mythtv
My server http://wesnewell.no-ip.com/cpu.php
HD Tivo S3 compared http://wesnewell.no-ip.com/mythtivo.htm
Reply by ●October 12, 2007
The little lost angel wrote:
> On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com>
> wrote:
>
>>Tilera to Introduce 64-Core Processor
>>By Andy Patrizio
>>
>>"The real problem with scale is existing multi-core architectures use
>>a bus. In that architecture, the bus is a central switch and all the
>>cores are connected to the single central switch. A packet has to go
>>through it no matter what, which is fine for one, two or four cores,
>>but it does not scale," he told internetnews.com.
>>
>>Tilera uses a mesh architecture, where the cores are laid out in a
>>checkerboard-like grid, all connected through high-speed
>>interconnects. "In architectures of this sort, you can keep growing
>>and you won't have any serious congestion," said Agarwal.
>
> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it? What difference is it from AMD's ccHTT links except
> they've got a few more?
>
> Or does he mean the 64 cores are all directly connected to each
> other... meaning there are some 63 connections coming out of each core
> to every other core for some mindboggling number? (I think
> 63+62+61+... but my abysmal ability with maths fails me here) But
> essentially becoming a nightmare if the number of cores go out. So
> unlikely to be case, no?

It most likely borrows from FPGA interconnect routing. That uses cross point muxes, to give a number somewhere between your two limits.

eg with a 64b interconnect, and crosspoints, any core can talk to any other core(s), 128b and you get duplex, and so on.

but much less than 63! (1.98e87)

-jg
Reply by ●October 12, 2007
The little lost angel wrote:
> On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com>
> wrote:
>
>> Tilera to Introduce 64-Core Processor
>> By Andy Patrizio
>>
>> "The real problem with scale is existing multi-core architectures use
>> a bus. In that architecture, the bus is a central switch and all the
>> cores are connected to the single central switch. A packet has to go
>> through it no matter what, which is fine for one, two or four cores,
>> but it does not scale," he told internetnews.com.
>>
>> Tilera uses a mesh architecture, where the cores are laid out in a
>> checkerboard-like grid, all connected through high-speed
>> interconnects. "In architectures of this sort, you can keep growing
>> and you won't have any serious congestion," said Agarwal.
>
> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it?

Dr. Agarwal appears to be using a bit of a strawman argument here; the Tilera mesh interconnect is indeed better than a bus, but I'm not aware of any CMPs that use a bus as their interconnect. Tilera's mesh interconnect probably has lower performance than a Niagara-style full crossbar, but it's also more scalable and probably less area.

Wes Felter - wesley@felter.org
Reply by ●October 12, 2007
On Oct 11, 1:02 pm, AirRaid <AirRaid...@gmail.com> wrote:
> Tilera to Introduce 64-Core Processor
> By Andy Patrizio
>
> An MIT-inspired startup will introduce a new multi-core chip today at
> the annual Hot Chips conference at Stanford University. The TILE64
> boasts a "clean sheet" design,
> unencumbered by any legacy
> compatibility concerns,

Which means no software is available.

> The TILE64 processor contains 64 full-featured, programmable
> cores that Tilera claims can perform 500 billion operations per second
> and delivers ten times the performance and thirty times the
> performance-per-watt of the Intel dual-core Xeon.

> The TILE family can scale up to even more, or down to a two-core
> design for the smallest of designs, such as a cell phone. Its power
> consumption is a few hundred milliwatts per core, Agarwal said. Its
> clock speed will range from 600MHz to 1GHz.

64 processors at 1 GHz giving 500 GIPS means 8 IPC/core? or 1 IPC/core with eight 8-16-32-64-bit sub-operations per cycle?

> But there's a lot more on the chip than just cores. It has a pair of
> 10 gigabit Ethernet ports directly on the chip for high speed
> networking, as well as on-board I/O and peripheral controllers. Its
> integrated memory controllers allow for up to 200 gigabits of memory
> bandwidth within the chip.

200 Gbits per what unit of time? 500 GIPS should require something in the 100GBytes/sec to 500GBytes/sec range of external memory bandwidth.

The only thing impressive, here, is the level of distortion.......
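
The arithmetic behind the "8 IPC/core" question is easy to check. This is a rough sketch using only the figures quoted from the article (500 billion operations per second, 64 cores, 600MHz to 1GHz). It assumes the 200 gigabit figure is per second, which the article never states, and the helper names are invented for illustration.

```python
def ops_per_core_per_cycle(total_ops_per_s: float, cores: int, clock_hz: float) -> float:
    """Operations each core must complete every clock cycle to reach the total."""
    return total_ops_per_s / (cores * clock_hz)


def gbits_to_gbytes(gbits: float) -> float:
    """Convert gigabits to gigabytes (8 bits per byte)."""
    return gbits / 8


if __name__ == "__main__":
    # At the top quoted clock speed of 1GHz
    print(ops_per_core_per_cycle(500e9, 64, 1e9))    # 7.8125
    # At the bottom quoted clock speed of 600MHz (about 13.02)
    print(ops_per_core_per_cycle(500e9, 64, 600e6))
    # ASSUMPTION: the 200 gigabit figure is per second
    print(gbits_to_gbytes(200))                      # 25.0
```

So the claimed total works out to just under 8 operations per core per cycle at 1GHz, and about 13 at 600MHz. If the 200 gigabits are per second, that is 25 GBytes/sec, well below the 100 to 500 GBytes/sec range suggested above.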