Reply by December 11, 2007
In comp.sys.ibm.pc.hardware.chips already5chosen@yahoo.com wrote in part:
> SRAM could shave off 15 ns in the case of a DRAM page miss, or
> 50-55 ns in the case of a page conflict, but those are very
> rare. In the supposedly most common case of a DRAM page hit,
> SRAM doesn't help at all. Actually, you will have a hard
> time finding commodity SRAMs that are as fast as the now-common
> DDR2-800 CL5 at page hit.
You are talking device response times, and I appreciate your information. However, I am interested in system response (software performance), and my measurements are far less encouraging:

Latency   CPU@MHz    mem.ctl      RAM
  (ns)
   88     K8@2000    NForce3      DDR400
  144     P3@1000    laptop       SO-PC133?
  148     2*P3@860   Serverworks  ??
  178     P4@1800    i850         RDRAM
  184     K7@1667    SiS735       PC133
  185     P3@600     440BX        PC100
  217     2*Cel@500  440BX        PC90
  234     P2@350     440BX        PC100?
  288     P2@333     440BX        PC66

I do need to find & test some more modern systems, but I'm underwhelmed by the slowness of latency improvement. CPU speed has increased at least 4x; latency response at best 2.5x.

Run this pgm from L2 (small set) and it comes back around 10 ns.

compile:  $ gcc -O2 lat10m.c
run:      $ time ./a.out    [multiply user time by 100 to give ns]

/* lat10m.c - Measure latency of 10 million fresh memory reads
   (C) Copyright 2005 Robert Redelmeier - GPL v2.0 licence granted */

int p[1 << 21];

int main(void)
{
    int i, j;

    /* fill p[] so each entry points 5000 entries back (mod 2^21) */
    for (i = 0; i < 1 << 21; i++)
        p[i] = 0x1FFFFF & (i - 5000);

    /* chase the chain: every load depends on the previous result,
       so nothing can be prefetched or overlapped */
    for (j = i = 0; i < 9600000; i++)
        j = p[j];

    return j;
}

-- Robert
Reply by John Ahlstrom December 10, 2007
Ken Hagan wrote:
> On Fri, 07 Dec 2007 22:51:06 -0000, daytripper
> <day_trippr@REMOVEyahoo.com> wrote:
>
>> So, in short, you don't think the biggest problem confronting
>> processor design and performance isn't important because "it's hard"...
>>
>> /daytripper (well, that's one way to go, I guess ;-)
>
> I dunno if it's a fair summary of Robert's position, but it is a fair
> piece of strategy. It is silly to try to solve an impossible problem.
> It is almost as silly to try to solve an almost impossible problem.
How about "It's not important because it is not cost-effective"?

--
A language that doesn't affect the way you think about programming
is not worth knowing.
          Alan Perlis
Reply by December 10, 2007
On Dec 8, 9:18 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>
>> For latency, there is nowhere left to go in terms of
>> completely unpredictable reads from memory (or disk).
>
> Sure there is -- SRAM and other designs which take more xtors
> per cell. With the continually decreasing marginal cost
> of xtors and a shortage of useful things to do with them,
> I expect this transition to happen at some point.
SRAM could shave off 15 ns in the case of a DRAM page miss, or 50-55 ns in the case of a page conflict, but those are very rare. In the supposedly most common case of a DRAM page hit, SRAM doesn't help at all. Actually, you will have a hard time finding commodity SRAMs that are as fast as the now-common DDR2-800 CL5 at page hit.

Another potential saving with SRAM comes from the fact that the memory controller is simpler. I don't know how much that could bring. The likes of Opteron and POWER6 run their MCs at very high speed, so I'd guess it would be hard to shave off more than 1-2 ns here.

Now look at the flip side:

1. Pins. The SRAM address bus is up to twice as wide as DRAM's. You can construct SRAM with pseudo-pages and a multiplexed address bus, but then you give up part of the latency advantage.

2. Capacity. The big one. SRAM capacity lags behind DRAM by a factor of 5-10. That means you will either need more channels (expensive motherboard; expensive packaging of the MPU/NB; not always possible due to mechanical constraints) or more DIMMs per channel. The latter normally means more buffering = higher latency. For example, for DDR2-667 one can put on one channel 2 unbuffered DIMMs (lowest latency), 4 registered DIMMs (medium latency) or up to 8 fully-buffered DIMMs (the highest latency).

3. Power consumption. I'm not an expert in this area, but by my understanding, under heavy load SRAM consumes 2-3 times more power than the equivalent DRAM. That's partly compensated by lower idle power consumption (no need for refresh).

4. Cost. That's the other unfortunate effect of lower capacity.
Reply by Ken Hagan December 10, 2007
On Fri, 07 Dec 2007 22:51:06 -0000, daytripper  
<day_trippr@REMOVEyahoo.com> wrote:

> So, in short, you don't think the biggest problem confronting processor
> design and performance isn't important because "it's hard"...
>
> /daytripper (well, that's one way to go, I guess ;-)

I dunno if it's a fair summary of Robert's position, but it is a fair piece of strategy. It is silly to try to solve an impossible problem. It is almost as silly to try to solve an almost impossible problem.
Reply by December 8, 2007
In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyersusa@gmail.com> wrote in part:
> To turn your argument over, if latency were king, Intel would be
> out of business and/or have changed tactics drastically. Intel has
> taken its own sweet time about moving away from its traditional
> memory architecture and seems to be doing quite nicely.
Your argument assumes Intel and AMD are identical with respect to market success. They are NOT! Intel is much larger and can afford many mistakes. AMD's production capacity is too small to be any sort of real threat, at least in the short and medium term.
> That's a one-time gain that has been known to be available at
> least since the last editions of alpha.
Sure. But why not grab it?
> For latency, there is nowhere left to go in terms of
> completely unpredictable reads from memory (or disk).
Sure there is -- SRAM and other designs which take more xtors per cell. With the continually decreasing marginal cost of xtors and a shortage of useful things to do with them, I expect this transition to happen at some point.
> All the tactics that work (prefetch, hide, cache) depend
> on the ability to foresee the future, another hobby horse
> of mine. Terje might claim that improvements come from
> cache management. Improvements in cache management come
> from more successfully exploiting nonrandomness; that is
> to say, the ability to predict the future.
I agree with Terje, and those things can be done in addition to debottlenecking the circuit response.

-- Robert
Reply by Robert Myers December 7, 2007
On Dec 7, 5:51 pm, daytripper <day_tri...@REMOVEyahoo.com> wrote:

> So, in short, you don't think the biggest problem confronting processor
> design and performance isn't important because "it's hard"...
>
> /daytripper (well, that's one way to go, I guess ;-)
If you need to make problems go dramatically faster, it isn't going to happen through reducing latency. A good processor design is one that doesn't make the situation worse. Within a factor of 2, that's surely the best you can hope to do. The only big knobs are bandwidth and predictability. As for latency, "it takes all the running you can do just to stay in the same place."

Robert.
Reply by daytripper December 7, 2007
On Fri, 7 Dec 2007 12:49:09 -0800 (PST), Robert Myers <rbmyersusa@gmail.com>
wrote:

>On Dec 7, 9:47 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
>> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>>
>> > On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
>> >> Of course, the first question they had was "what about latency?"
>>
>> > Bandwidth is king. Said it long ago. Wider is the only
>>
>> Uhm, err, for what sorts of problems/tasks? Had bandwidth
>> been always and overall governing, Rambus first iteration
>> would have succeeded. Their execs obviously thought they
>> had technical advantages worth the commercial conditions.
>> The market disagreed.
>>
>Rambus was hot and expensive.
>
>To turn your argument over, if latency were king, Intel would be out
>of business and/or have changed tactics drastically. Intel has taken
>its own sweet time about moving away from its traditional memory
>architecture and seems to be doing quite nicely.
>
>> > way left to go. We will see more and more of same and the
>> > only thing to do about latency is to hide it.
>>
>> This has often been tried with only partial success (video)
>> Sometimes latency governs and cannot be hidden (databases).
>> It must be reduced as AMD has done fairly successfully.
>>
>That's a one-time gain that has been known to be available at least
>since the last editions of alpha. For latency, there is nowhere left
>to go in terms of completely unpredictable reads from memory (or
>disk). All the tactics that work (prefetch, hide, cache) depend on
>the ability to foresee the future, another hobby horse of mine. Terje
>might claim that improvements come from cache management.
>Improvements in cache management come from more successfully
>exploiting nonrandomness; that is to say, the ability to predict the
>future.
>
>Robert.
So, in short, you don't think the biggest problem confronting processor design and performance isn't important because "it's hard"... /daytripper (well, that's one way to go, I guess ;-)
Reply by Robert Myers December 7, 2007
On Dec 7, 9:47 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>
> > On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
> >> Of course, the first question they had was "what about latency?"
>
> > Bandwidth is king. Said it long ago. Wider is the only
>
> Uhm, err, for what sorts of problems/tasks? Had bandwidth
> been always and overall governing, Rambus first iteration
> would have succeeded. Their execs obviously thought they
> had technical advantages worth the commercial conditions.
> The market disagreed.
Rambus was hot and expensive.

To turn your argument over, if latency were king, Intel would be out of business and/or have changed tactics drastically. Intel has taken its own sweet time about moving away from its traditional memory architecture and seems to be doing quite nicely.
> > way left to go. We will see more and more of same and the
> > only thing to do about latency is to hide it.
>
> This has often been tried with only partial success (video)
> Sometimes latency governs and cannot be hidden (databases).
> It must be reduced as AMD has done fairly successfully.
That's a one-time gain that has been known to be available at least since the last editions of Alpha. For latency, there is nowhere left to go in terms of completely unpredictable reads from memory (or disk). All the tactics that work (prefetch, hide, cache) depend on the ability to foresee the future, another hobby horse of mine. Terje might claim that improvements come from cache management. Improvements in cache management come from more successfully exploiting nonrandomness; that is to say, the ability to predict the future.

Robert.
Reply by December 7, 2007
In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyersusa@gmail.com> wrote in part:
> On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
>> Of course, the first question they had was "what about latency?"
>
> Bandwidth is king. Said it long ago. Wider is the only
Uhm, err, for what sorts of problems/tasks? Had bandwidth always and everywhere been the governing factor, Rambus's first iteration would have succeeded. Their execs obviously thought they had technical advantages worth the commercial conditions. The market disagreed.
> way left to go. We will see more and more of same and the
> only thing to do about latency is to hide it.
This has often been tried with only partial success (video). Sometimes latency governs and cannot be hidden (databases). It must be reduced, as AMD has done fairly successfully.

-- Robert
Reply by Robert Myers December 6, 2007
On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
> Of course, the first question they had was "what about latency?"
>
> /daytripper
Bandwidth is king. Said it long ago. Wider is the only way left to go. We will see more and more of the same, and the only thing to do about latency is to hide it.

Robert.