
Speaking of Multiprocessing...

Started by rickman March 23, 2017
On 3/26/2017 11:28 AM, David Brown wrote:
> On 25/03/17 20:27, rickman wrote:
>> On 3/25/2017 10:25 AM, David Brown wrote:
>>>
>>> Well, you are the one implementing this - so you have to figure out
>>> what solution makes most sense for you. Here's another idea you could
>>> consider.
>>>
>>> If you are dealing with just one CPU here, I have always thought a
>>> "disable interrupts for the next X instructions then restore interrupt
>>> status" instruction would be handy - with X being something like 4.
>>> That would let you do atomic reads, writes or read-modify-write
>>> instructions covering at least two memory addresses, without need for
>>> special memory or read-write-modify opcodes.
>>
>> How is that different from instructions to enable and disable
>> interrupts? This doesn't really help me as there are N logical CPUs.
>> They just share the same hardware. But they all run concurrently in
>> nearly every sense. They just use different clock cycles so that memory
>> accesses are not literally concurrent. So interrupts aren't the (only)
>> issue.
>>
>
> An instruction like the one I propose would be safer than the normal
> "disable all interrupts" instruction, because it places a limit on the
> latency of interrupts. Code that simply disables interrupts could do so
> for an arbitrary length of time - here it is specifically limited.
>
> It is harder to make this work well for your SMT cpu. Here you might
> change things to say that for these next 4 clock cycles, the current
> logical cpu runs on /every/ clock cycle - all other SMT threads are
> paused. Depending on how you have organised things, that might be
> simple or it might be nearly impossible. (On the XMOS, if only one
> thread is running it gets a maximum of 1 cycle out of every 5, with the
> rest wasted - this is due to the 5 stage pipeline of its cpu and that
> each logical cpu can only have one instruction in action at a time.)
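[For concreteness, a rough C sketch of the contrast David is describing -
not from the thread. The intrinsics __irq_save(), __irq_restore() and
__irq_off_next() are hypothetical stand-ins for the instructions involved,
not any real compiler API.]

#include <stdint.h>

/* Hypothetical intrinsics - placeholders for real CPU instructions. */
extern uint32_t __irq_save(void);          /* disable IRQs, return old state */
extern void     __irq_restore(uint32_t s); /* restore saved IRQ state        */
extern void     __irq_off_next(int n);     /* proposed: IRQs off for the next
                                              n instructions, then the status
                                              is restored automatically      */

volatile uint16_t head, count;             /* two words updated together     */

/* Conventional critical section: interrupt latency is bounded only by
   however long the code between save and restore happens to run. */
void push_classic(void)
{
    uint32_t s = __irq_save();
    head  = head + 1;
    count = count + 1;
    __irq_restore(s);
}

/* Proposed form: the hardware caps the blackout at n instructions, so the
   worst-case interrupt latency is known at design time. */
void push_bounded(void)
{
    __irq_off_next(4);   /* covers the two read-modify-write updates below */
    head  = head + 1;
    count = count + 1;
}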
The other CPUs can be halted (hard perhaps, but not impossible), but the point of the multi-CPU idea is to use a pipeline to make the clock faster while avoiding the warts of a single pipelined CPU by making it N CPUs instead. No one CPU can hog all the clock cycles because of the pipeline, no different from the XMOS design.

But just as important is not impacting interrupt latency. Preventing execution on any other CPU would hurt interrupt latency, and low latency is a primary design goal. This is intended for hard, real-time use - the sort of thing where a CPU may well be counting cycles for short delays or need to respond to an event on the next cycle. The instruction architecture of the last CPU I built allowed literally 1 clock of interrupt latency, as it only took one cycle to push all the needed info to the stacks.

The clock speed won't increase linearly with the pipeline length, but otherwise the cost of adding CPUs up to 16 is trivial. So I have thought of using some of the CPUs for interrupt handling. Can't get much faster than zero cycles. :)

--

Rick C
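[For readers unfamiliar with the barrel/SMT arrangement being discussed, a
small C simulation may help. It assumes 4 logical CPUs in a fixed round-robin
slot per clock, which is an illustrative assumption rather than the actual
design; it shows why no logical CPU can hog the pipeline and why dedicating
one logical CPU to an event source gives a small, fixed response time.]

#include <stdio.h>

#define N_CPUS 4          /* logical CPUs sharing one pipeline        */
#define CYCLES 16         /* clock cycles to simulate                 */

int main(void)
{
    int event_cycle = 6;  /* cycle at which an external event arrives */
    int serviced    = 0;

    for (int clk = 0; clk < CYCLES; clk++) {
        int cpu = clk % N_CPUS;   /* fixed round-robin slot per cycle */

        /* Every logical CPU issues on every Nth cycle, so none can
           hog the pipeline or starve the others. */
        printf("cycle %2d: CPU %d issues", clk, cpu);

        /* If CPU 0 is dedicated to one event source, the event is
           handled at CPU 0's next slot: at most N_CPUS-1 cycles of
           wait, with no context to save on entry. */
        if (!serviced && cpu == 0 && clk >= event_cycle) {
            printf("  <- handles event raised at cycle %d", event_cycle);
            serviced = 1;
        }
        printf("\n");
    }
    return 0;
}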