EmbeddedRelated.com

Tilera to Introduce 64-Core Processor

Started by AirRaid October 11, 2007
Tilera to Introduce 64-Core Processor
By Andy Patrizio

An MIT-inspired startup will introduce a new multi-core chip today at
the annual Hot Chips conference at Stanford University. The TILE64
boasts a "clean sheet" design, unencumbered by any legacy
compatibility concerns, that Tilera says will provide a huge leap in
multithreaded performance.

Tilera was founded in 2004 to bring to market the multi-core processor
designs of MIT researcher Anant Agarwal. Agarwal created what he
called a "mesh" multi-core architecture, where the cores are all
interconnected rather than going through a frontside bus, as Intel's
multi-core chips do.

Agarwal first created this multi-core architecture in 1996, long
before Intel and AMD were anywhere close to doing it. The project
received funding from the Defense Advanced Research Projects Agency
(DARPA) and the National Science Foundation, the agency that managed
the Internet for decades.

Tilera holds 40-plus patents for its multi-core design. TILE64 will be
the first in a series of processors built around massively multi-core
chips. The TILE64 processor contains 64 full-featured, programmable
cores that Tilera claims can perform 500 billion operations per second
and delivers ten times the performance and thirty times the
performance-per-watt of the Intel dual-core Xeon.

Agarwal said the company can make these performance leaps because it
doesn't use any legacy technologies or designs.

"The real problem with scale is existing multi-core architectures use
a bus. In that architecture, the bus is a central switch and all the
cores are connected to the single central switch. A packet has to go
through it no matter what, which is fine for one, two or four cores,
but it does not scale," he told internetnews.com.

Tilera uses a mesh architecture, where the cores are laid out in a
checkerboard-like grid, all connected through high-speed
interconnects. "In architectures of this sort, you can keep growing
and you won't have any serious congestion," said Agarwal.

Intel has promised to dispense with the frontside bus with the Nehalem
architecture, due late next year. AMD does not have a frontside bus in
the Opteron, but it's also using four cores at the most, while Tilera
is at 64.

The TILE family can scale up to even more, or down to a two-core
design for the smallest of designs, such as a cell phone. Its power
consumption is a few hundred milliwatts per core, Agarwal said. Its
clock speed will range from 600MHz to 1GHz.

But there's a lot more on the chip than just cores. It has a pair of
10 gigabit Ethernet ports directly on the chip for high speed
networking, as well as on-board I/O and peripheral controllers. Its
integrated memory controllers allow for up to 200 gigabits of memory
bandwidth within the chip.

That's what made the TILE64 chip so appealing to Top Layer, a developer
of network security and intrusion detection appliances. The company had
built its own processors but now plans to switch to Tilera's chips,
according to Chief Strategy Officer Mike Paquette.

"Our software is a multi-core design, and we were able to map out
functionality almost 1 for 1 for each process to a core in a Tilera
chip," he said. "The performance we expect in our estimates exceeds
what we could have gotten from any silicon providers."

Top Layer decided to license processors for future products rather
than bear the expense of building any more, and no other processors had
the scalability. "Because the movement of data is so much of what we do,
we needed a multi-core chip that was optimized for what we were doing
rather than something optimized for general purpose computing. Tilera
has capabilities for network capabilities that are far ahead of what
you can get from [x86] processors," said Paquette.

Tilera will ship a full development toolkit, called the Multicore
Development Environment (MDE), for building applications. It's an
Eclipse-based Integrated Development Environment (IDE) with an ANSI
standard C compiler, an application level library and tools for
debugging and profiling multi-core processors.

Wisely, Tilera is not taking on Intel and AMD right out of the gate,
as Transmeta did. It's going for the embedded market.

"We're focused on embedded because we are a startup and want to go
into a space where there is massive demand for performance like ours.
We can focus on a couple of markets and do really well in those
markets by addressing customer demands squarely and don't have to go
up against a dominant competitor," said Agarwal.

Tilera expects to sell the TILE64 processor for $435 in lots of 10,000
units. The company is also planning a 36-core and 120-core processor
for the near future.


http://www.internetnews.com/ent-news/article.php/3695116

AirRaid wrote:
> Tilera to Introduce 64-Core Processor
> By Andy Patrizio

Oh great ... more marketing hyperbole spam from AirRaid.
On Oct 12, 5:45 am, "Michael N. Moran" <mnmo...@bellsouth.net> wrote:
> AirRaid wrote:
> > Tilera to Introduce 64-Core Processor
> > By Andy Patrizio
>
> Oh great ... more marketing hyperbole spam from AirRaid.

Why do you say that, Michael? Can you give your opinion?

Jogging
joggingsong@gmail.com wrote:
> On Oct 12, 5:45 am, "Michael N. Moran" <mnmo...@bellsouth.net> wrote:
>> AirRaid wrote:
>>> Tilera to Introduce 64-Core Processor
>>> By Andy Patrizio
>> Oh great ... more marketing hyperbole spam from AirRaid.
>
> Why do you say that, Michael? Can you give your opinion?
>
> Jogging

He just did....
[[followups-to trimmed to only comp.arch]]

In comp.arch AirRaid <AirRaidJet@gmail.com> wrote:
> The TILE64 processor contains 64 full-featured, programmable
> cores that Tilera claims can perform 500 billion operations per second
> and delivers ten times the performance and thirty times the
> performance-per-watt of the Intel dual-core Xeon.
[[...]]
> integrated memory controllers allow for up to 200 gigabits of memory
> bandwidth within the chip.

What about off-chip bandwidth -- can this keep up with 64 cores' cache
misses? Is there a clever new door to get through the memory wall?

--
-- "Jonathan Thornburg -- remove -animal to reply" <jthorn@soton.ac-zebra.uk>
   School of Mathematics, U of Southampton, England
   "Washing one's hands of the conflict between the powerful and the
    powerless means to side with the powerful, not to be neutral."
                                 -- quote by Freire / poster by Oxfam
On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com>
wrote:

>Tilera to Introduce 64-Core Processor
>By Andy Patrizio
>
>"The real problem with scale is existing multi-core architectures use
>a bus. In that architecture, the bus is a central switch and all the
>cores are connected to the single central switch. A packet has to go
>through it no matter what, which is fine for one, two or four cores,
>but it does not scale," he told internetnews.com.
>
>Tilera uses a mesh architecture, where the cores are laid out in a
>checkerboard-like grid, all connected through high-speed
>interconnects. "In architectures of this sort, you can keep growing
>and you won't have any serious congestion," said Agarwal.

I'm a bit puzzled by this. If the cores are laid out in a checkerboard
like grid, doesn't that mean each core is linked to the 8 cores around
it? So it would still come up to some kind of latency bottleneck
wouldn't it? What difference is it from AMD's ccHTT links except
they've got a few more?

Or does he mean the 64 cores are all directly connected to each
other... meaning there are some 63 connections coming out of each core
to every other core for some mindboggling number? (I think
63+62+61+... but my abysmal ability with maths fails me here) But
essentially becoming a nightmare if the number of cores go out. So
unlikely to be case, no?

--
A Lost Angel, fallen from heaven
Lost in dreams, Lost in aspirations,
Lost to the world, Lost to myself
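
The two readings in the question above can be counted directly. What follows is a minimal sketch, not Tilera's documented topology: the article says only "checkerboard-like grid", so the 4-neighbour (north, south, east, west) mesh is an assumption, and the diagonal option covers the 8-neighbour reading. The function names are made up for illustration.

```python
def mesh_links(rows, cols, diagonals=False):
    """Count point-to-point links in a rows x cols grid of cores."""
    # Horizontal links plus vertical links between adjacent tiles.
    links = rows * (cols - 1) + cols * (rows - 1)
    if diagonals:
        # Assumed 8-neighbour variant: two diagonals per 2x2 block of tiles.
        links += 2 * (rows - 1) * (cols - 1)
    return links


def full_links(n):
    """Links needed if each of n cores is wired directly to every other."""
    # Equals 63 + 62 + ... + 1 when n == 64.
    return n * (n - 1) // 2


if __name__ == "__main__":
    print("8x8 mesh, 4 neighbours:", mesh_links(8, 8))
    print("8x8 mesh, 8 neighbours:", mesh_links(8, 8, diagonals=True))
    print("64 cores, all-to-all:  ", full_links(64))
```

For an 8 x 8 grid this comes to 112 links with 4 neighbours, 210 with diagonals, and 2016 for all-to-all wiring, so the sum 63+62+61+... works out to 2016.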
On Fri, 12 Oct 2007 16:16:32 +0000, The little lost angel wrote:

> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it? What difference is it from AMD's ccHTT links except
> they've got a few more?
>

A CPU has more than one layer. I'm not sure how many it has but I think AMD's is about 9 layers with the K8. I'd suspect the TILE64 is a lot more. The interconnect would be similar to AMD's HT interconnect bus.

> Or does he mean the 64 cores are all directly connected to each other...
> meaning there are some 63 connections coming out of each core to every
> other core for some mindboggling number? (I think 63+62+61+... but my
> abysmal ability with maths fails me here) But essentially becoming a
> nightmare if the number of cores go out. So unlikely to be case, no?

Probably the reason the core speeds are kept a lot lower. I know I'd like to have one of these on a small MB compatible with an ATX/BTX case, but realistically, I have no need for so much power. But cutting electrical use would be nice.

--
Want the ultimate in free OTA SD/HDTV Recorder? http://mythtv.org
http://mysettopbox.tv/knoppmyth.html Usenet alt.video.ptv.mythtv
My server http://wesnewell.no-ip.com/cpu.php
HD Tivo S3 compared http://wesnewell.no-ip.com/mythtivo.htm
The little lost angel wrote:
> On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com>
> wrote:
>
>>Tilera to Introduce 64-Core Processor
>>By Andy Patrizio
>>
>>"The real problem with scale is existing multi-core architectures use
>>a bus. In that architecture, the bus is a central switch and all the
>>cores are connected to the single central switch. A packet has to go
>>through it no matter what, which is fine for one, two or four cores,
>>but it does not scale," he told internetnews.com.
>>
>>Tilera uses a mesh architecture, where the cores are laid out in a
>>checkerboard-like grid, all connected through high-speed
>>interconnects. "In architectures of this sort, you can keep growing
>>and you won't have any serious congestion," said Agarwal.
>
> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it? What difference is it from AMD's ccHTT links except
> they've got a few more?
>
> Or does he mean the 64 cores are all directly connected to each
> other... meaning there are some 63 connections coming out of each core
> to every other core for some mindboggling number? (I think
> 63+62+61+... but my abysmal ability with maths fails me here) But
> essentially becoming a nightmare if the number of cores go out. So
> unlikely to be case, no?

It most likely borrows from FPGA interconnect routing. That uses cross point muxes, to give a number somewhere between your two limits.

eg with a 64b interconnect, and crosspoints, any core can talk to any other core(s), 128b and you get duplex, and so on.

but much less than 63! (1.98e87)

-jg
The little lost angel wrote:
> On Thu, 11 Oct 2007 11:02:14 -0700, AirRaid <AirRaidJet@gmail.com>
> wrote:
>
>> Tilera to Introduce 64-Core Processor
>> By Andy Patrizio
>>
>> "The real problem with scale is existing multi-core architectures use
>> a bus. In that architecture, the bus is a central switch and all the
>> cores are connected to the single central switch. A packet has to go
>> through it no matter what, which is fine for one, two or four cores,
>> but it does not scale," he told internetnews.com.
>>
>> Tilera uses a mesh architecture, where the cores are laid out in a
>> checkerboard-like grid, all connected through high-speed
>> interconnects. "In architectures of this sort, you can keep growing
>> and you won't have any serious congestion," said Agarwal.
>
> I'm a bit puzzled by this. If the cores are laid out in a checkerboard
> like grid, doesn't that mean each core is linked to the 8 cores around
> it? So it would still come up to some kind of latency bottleneck
> wouldn't it?

Dr. Agarwal appears to be using a bit of a strawman argument here; the Tilera mesh interconnect is indeed better than a bus, but I'm not aware of any CMPs that use a bus as their interconnect. Tilera's mesh interconnect probably has lower performance than a Niagara-style full crossbar, but it's also more scalable and probably less area.

Wes Felter - wesley@felter.org
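
One way to put a number on the mesh-versus-crossbar trade-off is to count hops. The sketch below is hedged: it assumes an 8 x 8 mesh with 4-neighbour links and shortest-path (Manhattan distance) routing, none of which is stated in the thread, and the function name is hypothetical. A full crossbar would be a single switch traversal between any pair, at the cost of a switch that grows with the square of the port count.

```python
from itertools import product


def mesh_hop_stats(side):
    """Return (worst-case, average) hop count between distinct tiles
    of a side x side mesh, assuming shortest-path routing."""
    tiles = list(product(range(side), repeat=2))
    hops = [
        abs(ax - bx) + abs(ay - by)      # Manhattan distance = hop count
        for (ax, ay) in tiles
        for (bx, by) in tiles
        if (ax, ay) != (bx, by)          # skip a tile talking to itself
    ]
    return max(hops), sum(hops) / len(hops)


if __name__ == "__main__":
    worst, mean = mesh_hop_stats(8)
    print(f"worst case: {worst} hops, average: {mean:.2f} hops")
```

Under those assumptions, corner-to-corner traffic crosses 14 links and a uniformly random pair of tiles is a little over 5 hops apart, which is the latency price paid for wiring that grows only linearly with the number of tiles.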
On Oct 11, 1:02 pm, AirRaid <AirRaid...@gmail.com> wrote:
> Tilera to Introduce 64-Core Processor
> By Andy Patrizio
>
> An MIT-inspired startup will introduce a new multi-core chip today at
> the annual Hot Chips conference at Stanford University. The TILE64
> boasts a "clean sheet" design, unencumbered by any legacy
> compatibility concerns,

Which means no software is available.

> The TILE64 processor contains 64 full-featured, programmable
> cores that Tilera claims can perform 500 billion operations per second
> and delivers ten times the performance and thirty times the
> performance-per-watt of the Intel dual-core Xeon.

> The TILE family can scale up to even more, or down to a two-core
> design for the smallest of designs, such as a cell phone. Its power
> consumption is a few hundred milliwatts per core, Agarwal said. Its
> clock speed will range from 600MHz to 1GHz.

64 processors at 1 GHz giving 500 GIPS means 8 IPC/core? or 1 IPC/core with eight 8-16-32-64-bit sub-operations per cycle?
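
The arithmetic behind that question, as a small sketch. The inputs are the article's headline figures (500 billion operations per second, 64 cores, 600MHz to 1GHz); what counts as one "operation" is not defined in the article, and the function name is illustrative.

```python
def ops_per_cycle_per_core(total_ops_per_s, cores, clock_hz):
    """Operations each core must complete every clock cycle
    to reach the claimed aggregate rate."""
    return total_ops_per_s / (cores * clock_hz)


if __name__ == "__main__":
    claimed = 500e9                      # "500 billion operations per second"
    for clock in (600e6, 1e9):           # quoted clock range
        rate = ops_per_cycle_per_core(claimed, 64, clock)
        print(f"{clock / 1e6:.0f} MHz: {rate:.2f} ops/cycle/core")
```

At 1GHz the claim needs about 7.8 operations per core per cycle, and about 13 at 600MHz, so the figure only makes sense if an "operation" is something smaller than a full instruction, such as one lane of a sub-word operation.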

> But there's a lot more on the chip than just cores. It has a pair of
> 10 gigabit Ethernet ports directly on the chip for high speed
> networking, as well as on-board I/O and peripheral controllers. Its
> integrated memory controllers allow for up to 200 gigabits of memory
> bandwidth within the chip.

200 Gbits per what unit of time? 500 GIPS should require something in the 100GBytes/sec to 500GBytes/sec range of external memory bandwidth.

The only thing impressive, here, is the level of distortion.......
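
To compare the two figures in that objection, here is a rough sketch that assumes the article's "200 gigabits" means 200 Gbit per second (the unit of time is missing in the source, as the poster notes) and takes the 500 billion operations per second at face value. The function name is hypothetical.

```python
def bytes_per_op(bandwidth_bits_per_s, ops_per_s):
    """Memory bytes available per operation at the given rates."""
    return bandwidth_bits_per_s / 8 / ops_per_s


if __name__ == "__main__":
    ops = 500e9                                   # claimed operations per second
    quoted = bytes_per_op(200e9, ops)             # assumed 200 Gbit/s
    low = bytes_per_op(100e9 * 8, ops)            # poster's 100 GBytes/sec
    high = bytes_per_op(500e9 * 8, ops)           # poster's 500 GBytes/sec
    print(f"quoted bandwidth:  {quoted:.2f} bytes/op")
    print(f"poster's estimate: {low:.2f} to {high:.2f} bytes/op")
```

Under that assumption 200 Gbit/s is 25 GBytes/sec, or 0.05 bytes per operation, against the 0.2 to 1 byte per operation implied by the poster's range. Whether that gap matters depends on how much traffic the on-chip caches absorb, which the article does not say.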
