Reply by Jeremy Williamson, January 31, 2005
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message
news:ctf9ic$n4c$1@blue.rahul.net...
> In article <ct3tks$8gj$1@news01.intel.com>,
> Jeremy Williamson <jeremiah.d.williamson@NOSPAMintel.com> wrote:
> >
> >"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message
> >news:cssfcg$fva$1@blue.rahul.net...
> >> In article <name99-42B264.17530821012005@localhost>,
> >> Maynard Handley <name99@name99.org> wrote:
> >> >Bottom line is that this thing doesn't resemble any traditional CPU and
> >> >is therefore a godawful match to existing languages, compilers and
> >> >algorithms.
> >>
> >> GPU shader algorithms and languages?  Common DSP library/toolbox calls?
> ...
> >You do realize that this will still have a GPU, and as a matter of fact Sony
> >gave up the idea of doing it themselves and gave the contract to nVidia (my
> >guess was cost vs. bowing to requests from the SW community).  Essentially
> >that means all your basic T&L (including your shader algorithms) are still
> >done on the GPU.
>
> Yes, but aren't people experimenting with using shader and DSP languages
> and tools for stuff that has nothing to do with the workstation display
> or audio output?
>
> The question is whether this software is commercially interesting and
> whether this cell device is more suited for this stuff than the GPUs on
> which these algorithms were developed.
>
> IMHO. YMMV.
> --
> Ron Nicholson  rhn AT nicholson DOT com   http://www.nicholson.com/rhn/
> #include <canonical.disclaimer>  // only my own opinions, etc.
Yes, especially since GPUs are slowly becoming more generically programmable
(thanks to the pixel and vertex shaders).  AIUI, the next generation is
likely to have primitive branching.  There was a full-day workshop on
porting applications to the GPU at SIGGRAPH last year, and there was even a
published paper on porting a database to the GPU.

But what I was trying to say is that the Cell is not a GPU, nor is it likely
to take away many of the tasks currently farmed out to today's GPUs (T&L).

Jeremy
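A minimal sketch (not from any of the posts above) of what "generically
programmable" means in practice: GPGPU work recasts an array computation as
a pure per-element kernel, the same shape a fragment shader runs once per
pixel.  The C below only mimics that mapping serially; on a GPU the outer
loop is what the hardware parallelizes across its pixel pipelines.

/* The "stream" formulation GPGPU work maps onto fragment shaders: a pure,
 * element-wise kernel with no cross-element state.  Run serially here. */
#include <stdio.h>

#define N 8

/* Per-element kernel: the body a fragment shader would run once per pixel. */
static float kernel(float a, float b)
{
    return a * b + 1.0f;                /* any pure, element-wise computation */
}

int main(void)
{
    float a[N], b[N], out[N];

    for (int i = 0; i < N; i++) {       /* fill two input "textures"          */
        a[i] = (float)i;
        b[i] = (float)(N - i);
    }

    for (int i = 0; i < N; i++)         /* on a GPU this loop is what gets    */
        out[i] = kernel(a[i], b[i]);    /* parallelized across the pipelines  */

    for (int i = 0; i < N; i++)
        printf("%g ", out[i]);
    printf("\n");
    return 0;
}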
Reply by Ronald H. Nicholson Jr., January 29, 2005
In article <ct3tks$8gj$1@news01.intel.com>,
Jeremy Williamson <jeremiah.d.williamson@NOSPAMintel.com> wrote:
> >"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message >news:cssfcg$fva$1@blue.rahul.net... >> In article <name99-42B264.17530821012005@localhost>, >> Maynard Handley <name99@name99.org> wrote: >> >Bottom line is that this thing doesn't resemble any traditional CPU and >> >is therefore a godawful match to existing languages, compilers and >> >algorithms. >> >> GPU shader algorithms and languages? Common DSP library/toolbox calls?
...
>You do realize that this will still have a GPU, and as a matter of fact Sony
>gave up the idea of doing it themselves and gave the contract to nVidia (my
>guess was cost vs. bowing to requests from the SW community).  Essentially
>that means all your basic T&L (including your shader algorithms) are still
>done on the GPU.
Yes, but aren't people experimenting with using shader and DSP languages
and tools for stuff that has nothing to do with the workstation display
or audio output?

The question is whether this software is commercially interesting and
whether this cell device is more suited for this stuff than the GPUs on
which these algorithms were developed.

IMHO. YMMV.
--
Ron Nicholson  rhn AT nicholson DOT com   http://www.nicholson.com/rhn/
#include <canonical.disclaimer>  // only my own opinions, etc.
Reply by Jeremy Williamson, January 24, 2005
"Ketil Malde" <ketil+news@ii.uib.no> wrote in message
news:egu0p7f6aq.fsf@ii.uib.no...
> "Xenon" <xenonxbox2@xboxnext.com> writes: > > > Cell Architecture Explained: Introduction > [...] > > 250 GFLOPS (Billion Floating Point Operations per Second) > [...] > > 6.4 Gigabit / second off-chip communication > > A little bit memory starved, I guess -- or do you have an application > that performs in the neighborhood of fifty FLOPS per *bit*? > > -kzm > -- > If I haven't seen further, it is by standing in the footprints of giants
GPUs do.  Stages and stages of pure logic circuitry.  On this beast, every
CPU/APU would have to be executing out of cache, of course.  Being memory
starved is par for the course -- one of those issues that only gets worse as
we move forward.

J
Reply by Jeremy Williamson, January 24, 2005
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message
news:cssfcg$fva$1@blue.rahul.net...
> In article <name99-42B264.17530821012005@localhost>,
> Maynard Handley <name99@name99.org> wrote:
> >Bottom line is that this thing doesn't resemble any traditional CPU and
> >is therefore a godawful match to existing languages, compilers and
> >algorithms.
>
> GPU shader algorithms and languages?  Common DSP library/toolbox calls?
>
> --
> Ron Nicholson  rhn AT nicholson DOT com   http://www.nicholson.com/rhn/
> #include <canonical.disclaimer>  // only my own opinions, etc.
???

You do realize that this will still have a GPU, and as a matter of fact Sony
gave up the idea of doing it themselves and gave the contract to nVidia (my
guess was cost vs. bowing to requests from the SW community).  Essentially
that means all your basic T&L (including your shader algorithms) are still
done on the GPU.

J
Reply by Andrew Ryan Chang, January 24, 2005
Xenon <xenonxbox2@xboxnext.com> wrote:
>Cell Architecture Explained: Introduction
A discussion at Joystiq points out that Mr. Blachford, from whom you stole
the article, also explained how to make an antigravity device, and how light
reduces in frequency the further it travels...

http://www.blachford.info/quantum/gravity.html
http://www.blachford.info/quantum/dimeng.html

Follow-ups set to rgv.sony; apologies for this idiot crossposter to the rest
of you all.

--
"It's only now, with 'Blinded by the Right,' that conservatives have grown a
sense of journalistic skepticism when it comes to [David] Brock."
  - "Fight or Flight", David Talbot, Salon, Apr 17, 2002
Reply by Nicholas O. Lindan, January 24, 2005
> In alt.games.video.xbox CEO Gargantua <gamers@r.lamers> wrote:
This high authority on the issue spake:
> Moore's Law is dead, and it's taking Wintel down with it.
Pantagruel sez:  Then they'll have to start making the software faster.
Won't that be a hoot!

http://www.centaurgalleries.com/Art/00077/I04274-02-500h.jpg
Reply by Doug Jacobs, January 24, 2005
In alt.games.video.xbox CEO Gargantua <gamers@r.lamers> wrote:

> Moore's Law is dead, and it's taking Wintel down with it.
But is it even needed anymore?

Think about it - how many people *NEED* that 3+GHz processor?  And the
people who do actually need that sort of power are going with SMP or
parallel processing systems already.  Even Doom3 is more concerned with the
processor and memory of the graphics card - not with your main CPU.

If anything, the days of the single-CPU system are numbered.  If you can't
make the individual chip go faster, why not throw more chips at the problem?
This won't lead to a speed-up across the board, but imagine being able to
dedicate a processor to each application you're running on your system.
BeOS used to be able to do this, and even let you set how many processors
you wanted dedicated to each process if you wanted.  Just don't set it to 0
CPUs for the OS...bad things would happen ;)

As for margins on its chips, I wouldn't worry about Intel just yet.  If they
start doing multiple-core processors (one chip, 2 or more CPUs), that will
push their prices along nicely.  After all, there's only so much room on a
standard desktop ATX motherboard.
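The per-application CPU dedication described above survives in current OSes
as processor affinity.  Here is a minimal sketch, assuming Linux and glibc's
sched_setaffinity(); BeOS exposed roughly the same control through its own
kernel calls and GUI.

/* Minimal sketch: pin the current process to CPU 0, roughly what the BeOS
 * per-process CPU setting did.  Assumes Linux and glibc's affinity API. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);            /* start with an empty CPU set              */
    CPU_SET(0, &mask);          /* allow this process to run on CPU 0 only  */

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("now restricted to CPU 0\n");
    return 0;
}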
Reply by Douglas Siebert, January 24, 2005
Ketil Malde <ketil+news@ii.uib.no> writes:

>"Xenon" <xenonxbox2@xboxnext.com> writes:
>> Cell Architecture Explained: Introduction
> [...]
>> 250 GFLOPS (Billion Floating Point Operations per Second)
> [...]
>> 6.4 Gigabit / second off-chip communication
>A little bit memory starved, I guess -- or do you have an application
>that performs in the neighborhood of fifty FLOPS per *bit*?
The claim made in the Cell paper is that there are 8 Rambus XDR channels at
3.2 GB/s each, for a total of 25.6 GB/s.  (I could have sworn XDR was
supposed to be 6.4 GB/s, but maybe that was down the road.)  That 6.4 GB/s
off-chip communication is the hypertransport equivalent (and also supposed
to be per pin; I can't remember how wide that was supposed to be).  Not that
this gets it near 250 GFLOPS usable for problems larger than a few
megabytes.

--
Douglas Siebert                     dsiebert@excisethis.khamsin.net

"They that can give up essential liberty to obtain a little temporary safety
deserve neither liberty nor safety" -- Thomas Jefferson
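A quick back-of-the-envelope check of the figures being traded in this
thread, taking the 250 GFLOPS, 6.4 Gbit/s, and 8 x 3.2 GB/s numbers at face
value:

/* Back-of-the-envelope arithmetic-intensity check using the figures quoted
 * above (250 GFLOPS peak, a 6.4 Gbit/s off-chip link, 8 XDR channels at
 * 3.2 GB/s each).  None of these numbers are mine; only the division is. */
#include <stdio.h>

int main(void)
{
    double gflops     = 250.0;      /* claimed peak, GFLOP/s            */
    double link_gbit  = 6.4;        /* off-chip link, Gbit/s            */
    double xdr_gbytes = 8 * 3.2;    /* 8 XDR channels, GB/s = 25.6      */

    printf("FLOPs per bit over the 6.4 Gbit/s link: %.1f\n",
           gflops / link_gbit);                    /* ~39 FLOPs per bit   */
    printf("FLOPs per byte of XDR bandwidth:        %.1f\n",
           gflops / xdr_gbytes);                   /* ~9.8 FLOPs per byte */
    printf("FLOPs per 4-byte float from XDR:        %.1f\n",
           4.0 * gflops / xdr_gbytes);             /* ~39 FLOPs per float */
    return 0;
}

Either way you slice it, the peak only matters for kernels that reuse each
operand dozens of times once it is on chip.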
Reply by Ketil Malde, January 24, 2005
"Xenon" <xenonxbox2@xboxnext.com> writes:

> Cell Architecture Explained: Introduction
[...]
> 250 GFLOPS (Billion Floating Point Operations per Second)
[...]
> 6.4 Gigabit / second off-chip communication
A little bit memory starved, I guess -- or do you have an application
that performs in the neighborhood of fifty FLOPS per *bit*?

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants
Reply by Robert Myers, January 22, 2005
On Sat, 22 Jan 2005 01:53:48 GMT, Maynard Handley <name99@name99.org>
wrote:

>In article <QvadnatFwfzw6W3cRVn-ow@comcast.com>,
> "Xenon" <xenonxbox2@xboxnext.com> wrote:
>
>> The lack of cache and virtual memory systems means the APUs operate in a
>> different way from conventional CPUs. This will likely make them harder to
>                                                   ^^^^^
>> program but they have been designed this way to reduce complexity and
>> increase performance.
>
>You don't say.
>Programming Itanic was a picnic compared to programming this thing; at
>least Itanic used a traditional computer architecture.
>And yet Intel/HP, with all the money in the world, couldn't make it fly.
>Please tell us why IBM/Sony/Toshiba can do what Intel/HP could not.
>
Itanium and Cell both offer advantages for problems that can be formulated
to exploit the architecture.  In the case of Itanium, the advantages have
turned out not to be overwhelming.  In the case of stream processors, there
are already off-the-shelf GPUs that can significantly outperform any
conventional microprocessor for some kinds of problems, and the advantage of
stream processors will only grow as feature sizes decrease.
>(Note, I am not denying that Cell may make a fine Playstation chip.
>I AM denying that it will make a fine workstation chip, will take over
>the computing world, make all other CPUs obsolete, blah blah blah.)
>
Predicting the future is really hard.  Genuine paradigm shifts are rare, but
I think this one is on its way.  The future of computing is more like what
happens on network processors and GPUs than what happens on x86, PowerPC, or
Itanium.  The change is being driven by physics, not marketing.
>> This may sound like an inflexible system which will be complex to program
>> and it most likely is but this system will deliver data to the APU registers
>
>So in return for giving up cache, your code has to manually move data
>to/from memory. That'll be easy for the compiler to figure out.
>
Of course it won't.  But the same problem exists--how do I figure out how to
get the data to where I need it when I need it?--in any architecture.  Cache
and registers add a set of tools for dealing with that problem; they don't
make it go away.  In the case of at least some stream processors, there is a
_register_ hierarchy: a low-bandwidth stream register file that faces memory
and local register files that act much like a conventional vector register
file.

<snip>
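To make the "manually move data to/from memory" point concrete, here is a
minimal sketch (not anything IBM/Sony/Toshiba have published) of
double-buffered staging into a small local store.  The dma_get() helper is a
hypothetical stand-in for whatever transfer primitive the real hardware
exposes; the point is that software, not a cache, decides what is resident
and when it moves.

/* Double-buffered staging of data from main memory into a small local store.
 * Assumes total is a multiple of CHUNK for brevity; dma_get() is made up.  */
#include <stddef.h>
#include <string.h>

#define CHUNK 1024                      /* elements that fit in local store   */

static float local_buf[2][CHUNK];       /* compute out of one, fill the other */

/* Hypothetical copy from main memory into local store (a real DMA engine
 * would run this asynchronously while the other buffer is being processed). */
static void dma_get(float *dst, const float *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

static float process(const float *buf, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += buf[i] * buf[i];
    return sum;
}

float sum_of_squares(const float *mem, size_t total)
{
    float sum = 0.0f;
    int cur = 0;

    dma_get(local_buf[cur], mem, CHUNK);            /* prime the first buffer */
    for (size_t off = 0; off < total; off += CHUNK) {
        int nxt = cur ^ 1;
        if (off + CHUNK < total)                     /* start the next fill   */
            dma_get(local_buf[nxt], mem + off + CHUNK, CHUNK);
        sum += process(local_buf[cur], CHUNK);       /* work on resident data */
        cur = nxt;
    }
    return sum;
}

On a conventional CPU the cache does this bookkeeping invisibly; on a
local-store machine it becomes the programmer's (or the compiler's) problem,
which is exactly the dispute in this thread.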
>
>There's (much much, so much) more blather and ranting about how
>fantastic Cell is and how it will solve any problem you can possibly
>imagine, but for those of us in the reality-based community, I think the
>points I have extracted above are the highlights.
>
>Bottom line is that this thing doesn't resemble any traditional CPU and
>is therefore a godawful match to existing languages, compilers and
>algorithms. Unless IBM/Sony/Toshiba have, in some other pocket and kept
>an extremely good secret, something that solves problems many people have
>been working on for more than twenty years, you'll be programming this
>thing with an assembly language mindset, even if you are nominally using
>a high-level language --- like you program AltiVec today. Only it'll be
>so much more fun because not only will you be worrying about alignment
>and algorithm issues, you'll be trying to juggle fitting your
>instructions and data into local memory (we weren't given a size for
>this but if it is to run at L1 cache speeds, it can't be wildly far off
>from say 64K to 512K bytes); none of that getting the cache to just hide
>the problem for you if you might want to load from an infrequently used
>table, handle a rare exception condition or whatever; it'll be manual
>segment swapping all over again. Not to mention the other glorious
>aspects. You'll be using some bizarro method to handle coherency. You'll
>have the engine that drives your code and handles exceptions and such
>running on a different processor from where the compute intensive code
>lives.
>
Maybe.  Somebody must like programming these things, because people are
already doing it--just for fun, apparently.  The problems are formidable,
but it is early days yet when it comes to inventing programming models and
algorithms for stream processors.  One future I can see is that data (and
instructions) will no longer be associated with memory locations but with
labelled packets.

Will there always be something that looks like a conventional
microprocessor?  Let's wait and see what the promised workstations look
like.  Weren't we supposed to have seen them last fall?

The one thing in all this that _really_ gives me pause is that making it
work in the general case seems like getting a dataflow machine to work in
the general case.

There's a really nice summary of GPU programming entering the mainstream at

http://www.computer.org/computer/homepage/1003/entertainment/
>
>And all this from IBM/Sony/Toshiba, three companies traditionally known
>for their openness and willingness to share with the public. I imagine
>Intel, AMD and Microsoft are quaking in their boots.
>
They couldn't possibly be less open than the graphics card manufacturers
have been.

RM