Reply by December 11, 2007
In comp.sys.ibm.pc.hardware.chips already5chosen@yahoo.com wrote in part:
> SRAM could shave off 15 ns in the case of a DRAM page miss, or
> 50-55 ns in the case of a page conflict, but those are very
> rare. In the supposedly most common case of a DRAM page hit,
> SRAM doesn't help at all. Actually, you will have a hard
> time finding commodity SRAMs that are as fast as the now-common
> DDR2-800 CL5 at page hit.
You are talking device response times, and I appreciate your information. However, I am interested in system response (software performance), and my measurements are far less encouraging:

Latency   CPU@MHz    mem.ctl      RAM
  (ns)
   88     K8@2000    NForce3      DDR400
  144     P3@1000    laptop       SO-PC133?
  148     2*P3@860   Serverworks  ??
  178     P4@1800    i850         RDRAM
  184     K7@1667    SiS735       PC133
  185     P3@600     440BX        PC100
  217     2*Cel@500  440BX        PC90
  234     P2@350     440BX        PC100?
  288     P2@333     440BX        PC66

I do need to find & test some more modern systems, but I'm underwhelmed by the slowness of latency improvement. CPU speed has increased at least 4x; latency response at best 2.5x.

Run this pgm from L2 (small set) and it comes back around 10 ns.

compile:  $ gcc -O2 lat10m.c
run:      $ time ./a.out    [multiply user time by 100 to give ns]

/* lat10m.c - Measure latency of 10 million fresh memory reads
   (C) Copyright 2005 Robert Redelmeier - GPL v2.0 licence granted */

int p[1 << 21];

int main(void)
{
    int i, j;

    /* fill p[] so each entry points 5000 entries back (mod 2^21) */
    for (i = 0; i < 1 << 21; i++)
        p[i] = 0x1FFFFF & (i - 5000);

    /* chase the chain: every load depends on the previous result,
       so nothing can be prefetched or overlapped */
    for (j = i = 0; i < 9600000; i++)
        j = p[j];

    return j;
}

-- Robert
Reply by John Ahlstrom December 10, 2007
Ken Hagan wrote:
> On Fri, 07 Dec 2007 22:51:06 -0000, daytripper
> <day_trippr@REMOVEyahoo.com> wrote:
>
>> So, in short, you don't think the biggest problem confronting
>> processor design and performance isn't important because "it's hard"...
>>
>> /daytripper (well, that's one way to go, I guess ;-)
>
> I dunno if it's a fair summary of Robert's position, but it is a fair
> piece of strategy. It is silly to try to solve an impossible problem.
> It is almost as silly to try to solve an almost impossible problem.
How about "It's not important because it is not cost-effective"?

--
A language that doesn't affect the way you think about programming
is not worth knowing.
          Alan Perlis
Reply by December 10, 2007
On Dec 8, 9:18 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>
>> For latency, there is nowhere left to go in terms of
>> completely unpredictable reads from memory (or disk).
>
> Sure there is -- SRAM and other designs which take more xtors
> per cell. With the continually decreasing marginal cost
> of xtors and a shortage of useful things to do with them,
> I expect this transition to happen at some point.
SRAM could shave off 15 ns in the case of a DRAM page miss, or 50-55 ns in the case of a page conflict, but those are very rare. In the supposedly most common case of a DRAM page hit, SRAM doesn't help at all. Actually, you will have a hard time finding commodity SRAMs that are as fast as the now-common DDR2-800 CL5 at page hit.

Another potential saving with SRAM comes from the fact that the memory controller is simpler. I don't know how much that could bring. The likes of Opteron and POWER6 run their MCs at very high speed, so I'd guess it would be hard to shave off more than 1-2 ns here.

Now look at the flip side:

1. Pins. The SRAM address bus is up to twice as wide as DRAM's. You can construct SRAM with pseudo-pages and a multiplexed address bus, but then you give up part of the latency advantage.

2. Capacity. The big one. SRAM capacity lags behind DRAM by a factor of 5-10. That means you will either need more channels (expensive motherboard; expensive packaging of the MPU/NB; not always possible due to mechanical constraints) or more DIMMs per channel. The latter normally means more buffering = higher latency. For example, for DDR2-667 one can put on one channel 2 unbuffered DIMMs (lowest latency), 4 registered DIMMs (medium latency) or up to 8 fully-buffered DIMMs (the highest latency).

3. Power consumption. I'm not an expert in this area, but by my understanding, under heavy load SRAM consumes 2-3 times more power than the equivalent DRAM. That's partly compensated by lower idle power consumption (no need for refresh).

4. Cost. That's the other unfortunate effect of lower capacity.
Reply by Ken Hagan December 10, 2007
On Fri, 07 Dec 2007 22:51:06 -0000, daytripper  
<day_trippr@REMOVEyahoo.com> wrote:

> So, in short, you don't think the biggest problem confronting processor
> design and performance isn't important because "it's hard"...
>
> /daytripper (well, that's one way to go, I guess ;-)

I dunno if it's a fair summary of Robert's position, but it is a fair piece of strategy. It is silly to try to solve an impossible problem. It is almost as silly to try to solve an almost impossible problem.
Reply by December 8, 2007
In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyersusa@gmail.com> wrote in part:
> To turn your argument over, if latency were king, Intel would be
> out of business and/or have changed tactics drastically. Intel has
> taken its own sweet time about moving away from its traditional
> memory architecture and seems to be doing quite nicely.
Your argument assumes Intel and AMD are identical with respect to market success. They are NOT! Intel is much larger and can afford many mistakes. AMD's production capacity is too small to be any sort of real threat, at least in the short and medium term.
> That's a one-time gain that has been known to be available at
> least since the last editions of alpha.
Sure. But why not grab it?
> For latency, there is nowhere left to go in terms of
> completely unpredictable reads from memory (or disk).
Sure there is -- SRAM and other designs which take more xtors per cell. With the continually decreasing marginal cost of xtors and a shortage of useful things to do with them, I expect this transition to happen at some point.
> All the tactics that work (prefetch, hide, cache) depend
> on the ability to foresee the future, another hobby horse
> of mine. Terje might claim that improvements come from
> cache management. Improvements in cache management come
> from more successfully exploiting nonrandomness; that is
> to say, the ability to predict the future.
I agree with Terje, and those things can be done in addition to debottlenecking the circuit response.

-- Robert
Reply by Robert Myers December 7, 2007
On Dec 7, 5:51 pm, daytripper <day_tri...@REMOVEyahoo.com> wrote:

> So, in short, you don't think the biggest problem confronting processor
> design and performance isn't important because "it's hard"...
>
> /daytripper (well, that's one way to go, I guess ;-)
If you need to make problems go dramatically faster, it isn't going to happen through reducing latency. A good processor design is one that doesn't make the situation worse. Within a factor of 2, that's surely the best you can hope to do. The only big knobs are bandwidth and predictability. As for latency, "it takes all the running you can do just to stay in the same place."

Robert.
Reply by daytripper December 7, 2007
On Fri, 7 Dec 2007 12:49:09 -0800 (PST), Robert Myers <rbmyersusa@gmail.com>
wrote:

>On Dec 7, 9:47 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
>> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>>
>> > On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
>> >> Of course, the first question they had was "what about latency?"
>>
>> > Bandwidth is king. Said it long ago. Wider is the only
>>
>> Uhm, err, for what sorts of problems/tasks? Had bandwidth
>> been always and overall governing, Rambus first iteration
>> would have succeeded. Their execs obviously thought they
>> had technical advantages worth the commercial conditions.
>> The market disagreed.
>>
>Rambus was hot and expensive.
>
>To turn your argument over, if latency were king, Intel would be out
>of business and/or have changed tactics drastically. Intel has taken
>its own sweet time about moving away from its traditional memory
>architecture and seems to be doing quite nicely.
>
>> > way left to go. We will see more and more of same and the
>> > only thing to do about latency is to hide it.
>>
>> This has often been tried with only partial success (video)
>> Sometimes latency governs and cannot be hidden (databases).
>> It must be reduced as AMD has done fairly successfully.
>>
>That's a one-time gain that has been known to be available at least
>since the last editions of alpha. For latency, there is nowhere left
>to go in terms of completely unpredictable reads from memory (or
>disk). All the tactics that work (prefetch, hide, cache) depend on
>the ability to foresee the future, another hobby horse of mine. Terje
>might claim that improvements come from cache management.
>Improvements in cache management come from more successfully
>exploiting nonrandomness; that is to say, the ability to predict the
>future.
>
>Robert.
So, in short, you don't think the biggest problem confronting processor design and performance isn't important because "it's hard"... /daytripper (well, that's one way to go, I guess ;-)
Reply by Robert Myers December 7, 2007
On Dec 7, 9:47 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyers...@gmail.com> wrote in part:
>
> > On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
> >> Of course, the first question they had was "what about latency?"
>
> > Bandwidth is king. Said it long ago. Wider is the only
>
> Uhm, err, for what sorts of problems/tasks? Had bandwidth
> been always and overall governing, Rambus first iteration
> would have succeeded. Their execs obviously thought they
> had technical advantages worth the commercial conditions.
> The market disagreed.
Rambus was hot and expensive.

To turn your argument over, if latency were king, Intel would be out of business and/or have changed tactics drastically. Intel has taken its own sweet time about moving away from its traditional memory architecture and seems to be doing quite nicely.
> > way left to go. We will see more and more of same and the
> > only thing to do about latency is to hide it.
>
> This has often been tried with only partial success (video)
> Sometimes latency governs and cannot be hidden (databases).
> It must be reduced as AMD has done fairly successfully.
That's a one-time gain that has been known to be available at least since the last editions of Alpha. For latency, there is nowhere left to go in terms of completely unpredictable reads from memory (or disk). All the tactics that work (prefetch, hide, cache) depend on the ability to foresee the future, another hobby horse of mine. Terje might claim that improvements come from cache management. Improvements in cache management come from more successfully exploiting nonrandomness; that is to say, the ability to predict the future.

Robert.
Reply by December 7, 2007
In comp.sys.ibm.pc.hardware.chips Robert Myers <rbmyersusa@gmail.com> wrote in part:
> On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
>> Of course, the first question they had was "what about latency?"
>
> Bandwidth is king. Said it long ago. Wider is the only
Uhm, err, for what sorts of problems/tasks? Had bandwidth always and everywhere been the governing factor, Rambus's first iteration would have succeeded. Their execs obviously thought they had technical advantages worth the commercial conditions. The market disagreed.
> way left to go. We will see more and more of same and the
> only thing to do about latency is to hide it.
This has often been tried with only partial success (video). Sometimes latency governs and cannot be hidden (databases). It must be reduced, as AMD has done fairly successfully.

-- Robert
Reply by Robert Myers December 6, 2007
On Dec 5, 12:31 am, daytripper <day_tri...@REMOVEyahoo.com> wrote
> Of course, the first question they had was "what about latency?"
>
> /daytripper
Bandwidth is king. Said it long ago. Wider is the only way left to go. We will see more and more of the same, and the only thing to do about latency is to hide it.

Robert.