This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).
|
Hi

I stumbled over the following interesting problem:

If a CPU accesses external devices via memory mapped IO, the order of the memory accesses is (or might be) important. Accesses should be done in the same order as specified in the assembler program.

Consider a superscalar CPU with out-of-order execution. When using memory mapped IO, how do you prevent the CPU from reordering the memory access instructions that are associated with memory mapped IO? Is there a common technique used to solve this issue? How do you discriminate the memory access instructions used for normal memory transfers, which might be reordered, from accesses to memory mapped devices, which must not be reordered?

Best regards,
Christian

--
Christian Plessl <>
|
|
|
Hello,

The question was interesting. After a little searching I found the link below.

http://www.intel.com/design/intarch/techinfo/pentium/inout.htm#12294

Since the above information is architecture specific, you may also want to check how segmentation and paging work in general. Any operating systems textbook should be helpful for that.

Regards,
Betul

On Tue, 27 Aug 2002, Christian Plessl wrote:
> [....]
|
|
|
> Consider a superscalar CPU with out-of-order execution. When using
> memory mapped IO, how do you prevent the CPU from reordering the
> memory access instructions that are associated with memory mapped IO?

Some architectures do this using explicit memory barrier instructions. There may be barrier instructions that only affect data reads, data writes, combinations, or more exotic variants.

Some architectures have overloaded the NOP instruction to serve as a memory barrier, but this seems like a very poor choice.
|
I guess on a simplistic level, a mechanism must be provided that inhibits such optimisations, or one provides an independent IO bus. With an inhibit bit, a compiler could make use of it and provide a decorator. The compiler could pass this down through compilation to the back end and optimise to ensure out-of-order optimisation wouldn't occur.

This then brings us to the point where the compiler, given a sufficient model of the core, could optimise the out-of-order optimisations away, and then you wouldn't have the IO access problem. If the core can optimise with out-of-order execution, then does it not follow that the compiler can do nearly as good a job?

--
Veronica Merryfield, somewhere in Cambridgeshire, UK
"The best things in life aren't things"

----- Original Message -----
From: "Christian Plessl" <>
To: <>
Sent: Tuesday, August 27, 2002 1:18 PM
Subject: [fpga-cpu] Instruction reordering and memory mapped IO

> [....]
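In C, the closest existing thing to the source-level "decorator" described above is the `volatile` qualifier. The sketch below is illustrative only: `mmio_put_string` and the idea of a single transmit register are assumptions, and the caller passes an ordinary variable in place of a device register. Note that `volatile` constrains only the compiler (every access is performed, in program order relative to other volatile accesses). It does not by itself stop an out-of-order core from reordering the resulting bus accesses, so it complements a hardware mechanism rather than replacing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Send a string one character at a time through a (hypothetical)
   memory mapped transmit register. Because tx_reg points to a
   volatile object, the compiler must emit one store per character,
   in order; without volatile it could legally keep only the last one. */
static size_t mmio_put_string(volatile uint32_t *tx_reg, const char *s)
{
    size_t n = 0;

    assert(tx_reg != NULL && s != NULL);
    while (s[n] != '\0') {
        *tx_reg = (uint32_t)(unsigned char)s[n];  /* one store per char */
        n++;
    }
    return n;  /* number of register writes performed */
}
```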
|
|
|
> If the core can optimise with out-of-order execution, then does it not
> follow that the compiler can do nearly as good a job?

Not necessarily. The processor core's fetch unit can base decisions on the actual execution flow. A compiler can do varying degrees of static analysis, but the results may not be as good as what can be achieved with dynamic scheduling.
|
Indeed, look at Itanium2. The performance difference between I1 and I2 is substantial. The big changes were:

- more memory bandwidth from both bus and cache
- 2 more execution units
- a lot of dynamic logic of the flavor common in other out-of-order superscalar processors

That last one is a bit of a surprise, since the VLIW dogma seems to be "with the right architecture, static scheduling is enough". I'm wondering if this is a particularity of just EPIC, or if this might be an indication that VLIW may not deliver on its promise.

> A compiler can do varying degrees of static
> analysis, but the results may not be as good as what can be achieved
> with dynamic scheduling.
|
On Tuesday 27 August 2002 19:40, Betul Buyukkurt wrote:
> [....]
>
> Question was interesting. After a little bit of search found the link
> below.
>
> http://www.intel.com/design/intarch/techinfo/pentium/inout.htm#12294
>
> Since the above info will be architecture specific, you may also check
> how segmentation and paging works in general. Any operating systems
> textbook should be helpful in that.

Interesting pointer, thanks.

If I got this right, this means that the Intel Pentium Pro never reorders IO instructions, and memory-access instructions are only reordered if the addresses fall into a cacheable area.

Consequently, in the instruction issue stage the CPU has to check, for all pending loads/stores, that no reordering occurs if they are in a non-cacheable area as defined in the MTRRs. Sounds quite complex. Is this the usual way to handle this problem?

Best regards,
Chris

--
Christian Plessl <>
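The check described above can be pictured with a small software model. This is a toy sketch in C, not a description of how the Pentium Pro actually implements it: the range table, its addresses, and the names `lookup_mem_type` and `may_reorder` are all hypothetical, and the model ignores data dependencies between accesses to the same address. It only shows the decision "look up the memory type of each address, and allow reordering only when both accesses are cacheable".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mem_type { MEM_CACHEABLE, MEM_UNCACHEABLE };

struct mem_range {
    uint32_t base;
    uint32_t limit;        /* inclusive upper bound */
    enum mem_type type;
};

/* Hypothetical memory map: RAM is cacheable, one device page is not. */
static const struct mem_range range_table[] = {
    { 0x00000000u, 0x7FFFFFFFu, MEM_CACHEABLE   },
    { 0xF0000000u, 0xF0000FFFu, MEM_UNCACHEABLE },
};

/* Return the memory type for an address. Addresses not covered by the
   table are treated as uncacheable, which is the conservative choice. */
static enum mem_type lookup_mem_type(uint32_t addr)
{
    size_t count = sizeof range_table / sizeof range_table[0];

    for (size_t i = 0; i < count; i++) {
        if (addr >= range_table[i].base && addr <= range_table[i].limit)
            return range_table[i].type;
    }
    return MEM_UNCACHEABLE;
}

/* A younger access may pass an older one only if both are cacheable. */
static bool may_reorder(uint32_t older_addr, uint32_t younger_addr)
{
    return lookup_mem_type(older_addr) == MEM_CACHEABLE &&
           lookup_mem_type(younger_addr) == MEM_CACHEABLE;
}
```

In hardware the lookup would be done in parallel by comparators rather than by a loop, which is one reason the check need not be as expensive as it first sounds.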
|
Hi

Some architectures (MIPS, for example) have a non-cacheable memory space, usually used for memory mapped IO devices. I guess it's not a problem to flush the write buffers before accessing non-cacheable memory.

Best regards,

**********************************
dipl.ing. Domagoj Babic
FER University, RASIP dep.
email:
tel : +385 1 6129619
addr : Unska 3, 10000 Zagreb Croatia
*********************************

On Tue, 27 Aug 2002, Christian Plessl wrote:
> [....]
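As an illustration of the MIPS case mentioned above: on 32-bit MIPS, the kseg0 window (starting at 0x80000000) and the kseg1 window (starting at 0xA0000000) both map to the low 512 MB of physical memory, with kseg0 cached and kseg1 uncached. Drivers therefore reach device registers through kseg1 addresses. The helper names in this C sketch are made up for illustration, and the code only does the address arithmetic, so it runs on any host. It does not perform the write-buffer flush, which on real hardware needs a processor-specific instruction.

```c
#include <stdint.h>

#define KSEG0_BASE      0x80000000u  /* unmapped, cached window   */
#define KSEG1_BASE      0xA0000000u  /* unmapped, uncached window */
#define KSEG_PHYS_MASK  0x1FFFFFFFu  /* low 512 MB of physical space */
#define KSEG_SEL_MASK   0xE0000000u  /* top three address bits select the segment */

/* Uncached (kseg1) virtual address for a physical address. */
static uint32_t phys_to_kseg1(uint32_t phys)
{
    return (phys & KSEG_PHYS_MASK) | KSEG1_BASE;
}

/* Physical address behind a kseg0 or kseg1 virtual address. */
static uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & KSEG_PHYS_MASK;
}

/* Nonzero if the virtual address lies in the uncached kseg1 window. */
static int is_kseg1(uint32_t vaddr)
{
    return (vaddr & KSEG_SEL_MASK) == KSEG1_BASE;
}
```

With this scheme the "is this access IO?" question is answered by the top address bits alone, so no per-instruction marker is needed.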