Reply by Jim Granville September 11, 2011
On Sep 12, 1:35 am, Jon Kirwan <j...@infinitefactors.org> wrote:

> But nothing under $2, or close, showed up on the first sorted page.
Well, yes, those numbers are Market-droid claims (but they are in the 1K column, not the usual 6-digit claims...). ADI here says $1.99/1K and $2.99/100:
http://www.analog.com/en/processors-dsp/blackfin/adsp-bf592/processors/product.html
for a 200MHz/400MMAC part, with good 32-bit timers.
> But a closer reading of your comment might be that you mean to talk about cases where the processor itself is built on a fast, clocked process, let's say 400MHz given worst case pathways and pipelining limits (M4k at 90nm.) But that the flash and cache (and, I suppose, also the necessity these days for marketing purposes that a microcontroller include bullet-proof, class-A crystal drivers rather than specify some more complex high-speed design) may limit the useful speed to something less, say 80MHz. (Though I've read someone to say they've clocked the PIC32 at 120MHz, just had to set wait states for the flash.) That in this case, say, a sampling ADC of the SAR style might be clocked still at 400MHz to process a captured sample at whatever resolution without regard to the CPU clock rate? Is that it?
Close enough. There may be an 80MHz limit imposed by flash, or I might want to clock the CPU at 20MHz for power-budget reasons. It is silly that I then have a 50ns ceiling imposed on the peripherals.

Some parts (not many) DO allow faster peripheral clocks than the core. E.g. I've seen recent claims of 256MHz timers on XMega and some MSP430s, and some parts use delay-line calibration schemes to get to 1ns or below on PWM edge resolution; a subset of those also give higher-resolution capture.

I think I've seen a couple that use a high-frequency clock for UARTs, as those can nudge over 20MHz these days.

Then there are nice parts like the Silabs Si5351A (close to $1), which can output (almost) any 3 CLKs up to 160MHz, using fractional synthesisers and a 600-900MHz VCO+Xtal. These can help solve some clock conflicts.

-jg
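
To put numbers on the resolution complaint: a timer's edge or capture resolution is one period of whatever clock feeds it, so tying the peripheral clock to a 20MHz CPU clock means 50ns steps, while a 256MHz timer clock gives just under 4ns. A minimal sketch of that arithmetic follows; the function name is only for illustration, and the three clock rates are the ones mentioned in this post.

```python
def tick_ns(clock_hz):
    """Timer resolution in nanoseconds: one period of the timer's clock."""
    return 1e9 / clock_hz

# Clock rates taken from the discussion above.
for label, hz in [("CPU-limited, 20MHz", 20e6),
                  ("flash-limited, 80MHz", 80e6),
                  ("fast timer clock, 256MHz", 256e6)]:
    print(f"{label}: {tick_ns(hz):.2f} ns per tick")
```

The gap between the first and last rows is the resolution lost when the peripheral clock is forced to be no faster than the CPU clock.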
Reply by Jon Kirwan September 11, 2011
On Sat, 10 Sep 2011 16:50:32 -0700 (PDT), Jim Granville
<j.m.granville@gmail.com> wrote:

>On Sep 9, 11:49 am, Jon Kirwan <j...@infinitefactors.org> wrote:
>>
>> Interestingly, although the PIC32 achieves a fair pace (up to 80MHz), it isn't up to the MIPS synthesizable core, claimed to be over 400MHz capable at 90nm and over 200Mhz capable at 130nm.
>
>That's more a flash artifact, than a process one.
I'm aware. A wide flash bus is often used these days to compensate (with a little bit of ram to hold at least one line of it.) There's some commentary about this regarding the PIC32 in the architecture overview. It's also true for the MSP430's new FRAM device, as well. A wide FRAM bus is used with a little bit of cache ram to help reads. And on any day, FRAM writes are much faster than flash writes.
>Flash based uC seem to be rather stuck, for the last half decade, in speed at the 80/100/120MHz region, and only the RAM based ones get into the hundreds of MHz (like the sub $2 DSPs I mentioned above)
I didn't know which ones from TI and ADI you were referring to. I had done a quick check on Digikey: selected DSPs, selected ADI and TI, limited it to parts they actually have one or two of in stock, and then sorted on price. But nothing under $2, or close, showed up on the first sorted page.
>Even if the Flash limits the CPU speed, one of my peeves, is very few parts allow the peripherals to run to the silicon process speed, instead forcing the peripheral clock to be <= CPU clock.
>-jg
Flash read cycle times may be slow (and writes so slow it needn't be mentioned), but anything external to the device will also be slower than the cpu core can achieve. Within the chip, outputs have known loads and the transmission gates and inverters can be sized exactly for that known situation. Anything leaving the chip must go through oversized drivers which drive unknown loads and must therefore be sized for the worst design case. And that must also include the wire bonds, chip carrier, and leads, as well as whatever an end user might add to that. Trace widths must also be enough, at least in microcontrollers whose pins these days often must handle tens of milliamps, to sustain those currents and survive metal migration for some given lifetime. I have a hard time imagining that drivers/inputs facing unknown external loading can ever (on the same die) achieve the speeds that internal signals can, with known-in-advance loads used to size them.

But a closer reading of your comment might be that you mean to talk about cases where the processor itself is built on a fast, clocked process, let's say 400MHz given worst case pathways and pipelining limits (M4k at 90nm.) But that the flash and cache (and, I suppose, also the necessity these days for marketing purposes that a microcontroller include bullet-proof, class-A crystal drivers rather than specify some more complex high-speed design) may limit the useful speed to something less, say 80MHz. (Though I've read someone to say they've clocked the PIC32 at 120MHz, just had to set wait states for the flash.) That in this case, say, a sampling ADC of the SAR style might be clocked still at 400MHz to process a captured sample at whatever resolution without regard to the CPU clock rate? Is that it?

Jon
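
On the wide-flash-bus point earlier in this post, the reason a wide bus plus a one-line buffer compensates for slow flash can be shown with simple arithmetic: each slow access returns several instructions at once, so straight-line code can be fed at several times the raw flash rate. The sketch below is a back-of-envelope model only. The 50ns access time and the 128-bit line width are assumed example figures, not values taken from the PIC32 or MSP430 documentation, and the model ignores branches and line-buffer misses.

```python
def sequential_fetch_mips(flash_access_ns, line_bits, instr_bits=32):
    """Peak instruction supply (millions per second) for straight-line code
    when each flash access returns a whole line that is held in a RAM buffer."""
    instrs_per_line = line_bits // instr_bits
    # instructions per nanosecond, scaled by 1000 to give millions per second
    return instrs_per_line * 1e3 / flash_access_ns

# Assumed example: flash with a 50 ns access time.
print(sequential_fetch_mips(50, 32))    # narrow bus: one instruction per access
print(sequential_fetch_mips(50, 128))   # wide bus: four instructions per access
```

With these assumed numbers the narrow bus supplies 20 million instructions per second and the wide one 80 million, which is the kind of gap the line buffer is there to close.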
Reply by Jim Granville September 11, 2011
On Sep 11, 3:32 pm, Mark Borgerson <mborger...@comcast.net> wrote:
> In article <c831ff5d-0f43-4707-9788-efc01c5e1...@n19g2000prh.googlegroups.com>, j.m.granvi...@gmail.com says...
> > On Sep 9, 11:49 am, Jon Kirwan <j...@infinitefactors.org> wrote:
> > > Interestingly, although the PIC32 achieves a fair pace (up to 80MHz), it isn't up to the MIPS synthesizable core, claimed to be over 400MHz capable at 90nm and over 200Mhz capable at 130nm.
> >
> > That's more a flash artifact, than a process one.
> >
> > Flash based uC seem to be rather stuck, for the last half decade, in speed at the 80/100/120MHz region, and only the RAM based ones get into the hundreds of MHz (like the sub $2 DSPs I mentioned above)
> >
> > Even if the Flash limits the CPU speed, one of my peeves, is very few parts allow the peripherals to run to the silicon process speed, instead forcing the peripheral clock to be <= CPU clock.
>
> Peripherals, having to produce and accept signals from the outside world using traces with larger capacitance and inductance, will naturally be more limited in speed.
Perhaps, but that ceiling is rather above the Flash-Speed limit I was mentioning, and it clearly is not too much of an actual problem, as SOME uC vendors can manage Peripheral clocks faster than core speeds. It is a slow trend that I'd like to see become faster...

-jg
Reply by Mark Borgerson September 11, 2011
In article <c831ff5d-0f43-4707-9788-efc01c5e13ce@n19g2000prh.googlegroups.com>, j.m.granville@gmail.com says...
> On Sep 9, 11:49 am, Jon Kirwan <j...@infinitefactors.org> wrote:
> >
> > Interestingly, although the PIC32 achieves a fair pace (up to 80MHz), it isn't up to the MIPS synthesizable core, claimed to be over 400MHz capable at 90nm and over 200Mhz capable at 130nm.
>
> That's more a flash artifact, than a process one.
>
> Flash based uC seem to be rather stuck, for the last half decade, in speed at the 80/100/120MHz region, and only the RAM based ones get into the hundreds of MHz (like the sub $2 DSPs I mentioned above)
>
> Even if the Flash limits the CPU speed, one of my peeves, is very few parts allow the peripherals to run to the silicon process speed, instead forcing the peripheral clock to be <= CPU clock.
Peripherals, having to produce and accept signals from the outside world using traces with larger capacitance and inductance, will naturally be more limited in speed.

Mark Borgerson
Reply by Jim Granville September 10, 2011
On Sep 9, 11:49 am, Jon Kirwan <j...@infinitefactors.org> wrote:
>
> Interestingly, although the PIC32 achieves a fair pace (up to 80MHz), it isn't up to the MIPS synthesizable core, claimed to be over 400MHz capable at 90nm and over 200Mhz capable at 130nm.
That's more a flash artifact, than a process one.

Flash based uC seem to be rather stuck, for the last half decade, in speed at the 80/100/120MHz region, and only the RAM based ones get into the hundreds of MHz (like the sub $2 DSPs I mentioned above)

Even if the Flash limits the CPU speed, one of my peeves, is very few parts allow the peripherals to run to the silicon process speed, instead forcing the peripheral clock to be <= CPU clock.

-jg
Reply by Jon Kirwan September 9, 2011
On Sat, 10 Sep 2011 00:55:21 +0000 (UTC),
Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:

>Jon Kirwan <jonk@infinitefactors.org> wrote:
>
>><snip>
>> Thanks for the book suggestion. It supplements what I already know about the old R2000 TLBs, too. This is bringing back lots of memories, now.
>
>I think it's a good computer architecture book, with a more practical bent than the usual textbooks. I actually like the 2002 first edition better, but the second one is definitely more relevant to modern CPUs.
I'll look for the earlier one, as well, then. Knowledge like this only very gradually fades away, if at all. I also like paper, especially for something like this where I just lay out on the floor to read, so I will likely see about getting genuine editions.

Jon
Reply by September 9, 2011
Jon Kirwan <jonk@infinitefactors.org> wrote:

> Microchip and the PIC32 aren't even mentioned. The R4000 is, but the M4k is mentioned just once near the top in a long list of names. On page 38 they say that the multiply takes 4-12 clocks, so I know we are on "different pages" already. It's an old book, by now.
It is more concerned with the architecture (ie. MIPS32/64) than any particular implementation. The first edition touched more on different cores as the system level architecture wasn't standardised yet, but that's at least supposed to be fixed by now.
> By the way, it says at the bottom on page 38, "Integer multiply and divide operations never produce an exception; not even divide by zero..." Maybe I can take it that the PIC32 follows this guideline?
Well, the MIPS32 architecture manual states that they don't, so I would assume so.
> Which suggests the answer to a question I'd posed earlier as a possible difference between the Cortex-M3 and the M4k -- that the Cortex-M3 stalls until complete but the M4k does NOT stall. But it also suggests that even in the face of an interrupt event, the division continues and doesn't cause a stall unless there is a need for that in the new instruction stream.
That was my conclusion as well.
> Thanks for the book suggestion. It supplements what I already know about the old R2000 TLBs, too. This is bringing back lots of memories, now.

I think it's a good computer architecture book, with a more practical bent than the usual textbooks. I actually like the 2002 first edition better, but the second one is definitely more relevant to modern CPUs.

-a
Reply by Jon Kirwan September 9, 2011
On Fri, 9 Sep 2011 12:55:47 +0000 (UTC),
Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:

>Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:
>> I couldn't find any information in either the PIC32 docs or the MIPS architecture docs, but I'm leaning towards the MDU ignoring interrupts altogether. If the ISR uses the MDU, then the pipeline will stall until the previous instruction completes.
>
>Replying to myself, but this is apparently how the MDU worked in pre-MIPS32/64 days, nowadays it's a bit smarter. See pp. 108-109 in See MIPS Run 2nd ed. (Seems to be easy enough to find naughty PDF versions if you don't own a paper copy.)
>
>-a
Well, it wasn't all that easy to find. Lots of Chinese-language versions floating about, but my Chinese vocabulary is about 30 characters or so, so I'm very limited. I can count, draw "man," "big," and "too much," and things like "mouth," "door," and "window." Then I start running out.

I did find this as the only 'good' version easily findable:

http://people.openrays.org/~comcat/godson/doc/See.MIPS.Run.2nd.en.pdf

2007, apparently, for the book. Some years, now.

I read the relevant material. It mostly addresses older designs. For example, when it brings up the newer MIPS32 architecture at the top of page 109, it says "the instructions behave themselves," which is not entirely descriptive for me. Then it goes on to talk about "older CPUs" for the entire rest of that paragraph, the next one, and the next one before going on to another section. It never does talk, in detail, about the very architecture I want to know about.

Microchip and the PIC32 aren't even mentioned. The R4000 is, but the M4k is mentioned just once near the top in a long list of names. On page 38 they say that the multiply takes 4-12 clocks, so I know we are on "different pages" already. It's an old book, by now. But I love it, too!!! Thanks.

By the way, it says at the bottom on page 38, "Integer multiply and divide operations never produce an exception; not even divide by zero..." Maybe I can take it that the PIC32 follows this guideline?

Also, I read from this book that pipeline exceptions are delayed (allowed to flush through the pipeline) before being acted upon. But since by the earlier note a DIV cannot cause a divide-by-zero error, they never generate an exception. In fact, since the MDU is "autonomous" it probably would be very hard for it to "insert" any kind of exception into that pipeline flow, anyway. So that makes sense, too.

Elsewhere, they do talk about the floating point coprocessing taking multiple clocks. I can probably use their handling of that, by analogy, to help me understand what would happen in the MDU case. Here, it says on page 52 that "FP computations are allowed to proceed in parallel with the execution of later instructions, and the CPU is stalled if an instruction reads a result register before the computation finishes."

Which suggests the answer to a question I'd posed earlier as a possible difference between the Cortex-M3 and the M4k -- that the Cortex-M3 stalls until complete but the M4k does NOT stall. But it also suggests that even in the face of an interrupt event, the division continues and doesn't cause a stall unless there is a need for that in the new instruction stream.

Thanks for the book suggestion. It supplements what I already know about the old R2000 TLBs, too. This is bringing back lots of memories, now.

Jon
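
The behaviour inferred above (the divide runs on its own, and the pipeline only stalls if something asks for the result too early) can be captured in a toy model. This is a sketch of the reasoning, not a cycle-accurate description of the M4k: the 35-cycle latency is an assumed placeholder rather than a documented figure, and every other instruction is treated as taking one cycle.

```python
def stall_cycles(op_latency, independent_instrs):
    """Cycles the pipeline waits when the result is finally read, after
    `independent_instrs` single-cycle instructions that don't touch the MDU."""
    return max(0, op_latency - independent_instrs)

DIV_LATENCY = 35  # assumed placeholder, not a datasheet value

# Result read immediately: nothing overlaps, same cost as a blocking divider.
print(stall_cycles(DIV_LATENCY, 0))
# 20 unrelated instructions (or an ISR that avoids the MDU) run meanwhile.
print(stall_cycles(DIV_LATENCY, 20))
# Enough independent work hides the divide completely.
print(stall_cycles(DIV_LATENCY, 50))
```

Under this model an interrupt arriving mid-divide costs nothing extra unless the handler itself reads or reuses the MDU before the divide finishes.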
Reply by dp September 9, 2011
On Sep 9, 7:29 am, Jon Kirwan <j...@infinitefactors.org> wrote:
> ....
> Interesting you bring up DMA. I was using the SiLabs part to gain access to an internal, 1MHz, 16 bit SAR ADC (which, if external, would have cost me as much as the chip or more) and it required DMA -- (given that this is an 8051 style cpu, that won't be surprising to anyone.) Turns out, the documentation on the DMA section is poor and if you really want to know exactly what you are doing when you copy someone else's supposedly working code then the doc is not adequate to the task. So I sent off my questions, worked with the local disty, they pressed, I pressed, we all pressed. But SiLabs (US, anyway) didn't know. Thing is, they had to track down the DMA section designer who apparently now is off-shore; I think in Singapore or something. Took them months. He'd designed it 8 years before that time. But I got my answer, at least. Took maybe three months to do?
I was much luckier with the SDMA on the 5200B. Got in touch with the guy who had architected it; he was as helpful as he could possibly be and sent me some more data in addition to what was to be found on the web. Took me a few days to grasp it all and begin using it, but once I did it was really useful. The final thing I made for it was the Ethernet engine: the on-board Ethernet controller is only FIFO-ed both ways, the rest must be done by the SDMA. Took me only a day or so to implement the queue of buffers with pointers to packets etc. (obviously it took me a lot longer to integrate Ethernet completely, but this was CPU code).
> I am enjoying this.
>
> Jon

Me too.

Dimiter
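
For readers who haven't built one, the "queue of buffers with pointers to packets" mentioned above usually takes the form of a fixed-size ring: one side fills slots with packet references, the other drains them in order. The sketch below is a generic software model of that idea, not the 5200B SDMA programming interface; the class and method names are made up for illustration.

```python
class DescriptorRing:
    """Fixed-size ring of packet slots shared by a producer (e.g. a DMA
    engine filling receive buffers) and a consumer (the CPU)."""

    def __init__(self, size):
        self.slots = [None] * size  # each slot holds a packet, or None if empty
        self.head = 0               # next slot the producer fills
        self.tail = 0               # next slot the consumer drains
        self.count = 0              # number of occupied slots

    def put(self, packet):
        """Producer side: returns False when the ring is full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.head] = packet
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return True

    def get(self):
        """Consumer side: returns the oldest packet, or None if empty."""
        if self.count == 0:
            return None
        packet = self.slots[self.tail]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % len(self.slots)
        self.count -= 1
        return packet
```

A real driver would keep such descriptors in memory the DMA engine can reach and use an ownership flag per slot instead of a shared counter, but the wrap-around indexing is the same.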
Reply by September 9, 2011
Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:
> I couldn't find any information in either the PIC32 docs or the MIPS architecture docs, but I'm leaning towards the MDU ignoring interrupts altogether. If the ISR uses the MDU, then the pipeline will stall until the previous instruction completes.

Replying to myself, but this is apparently how the MDU worked in pre-MIPS32/64 days, nowadays it's a bit smarter. See pp. 108-109 in See MIPS Run 2nd ed. (Seems to be easy enough to find naughty PDF versions if you don't own a paper copy.)

-a