In comp.arch.embedded,
Stef <stef33d@yahooI-N-V-A-L-I-D.com.invalid> wrote:
> In comp.arch.embedded,
> Tim Wescott <tim@seemywebsite.really> wrote:
>> On Fri, 22 Nov 2013 11:20:36 +0100, Stef wrote:
>>> 
>>> Were you looking for CPUs with double-precision FP? Did you find any
>>> smallish ones? I'm looking for such a processor myself and it seems only
>>> the bigger CPUs have DP FP. It seems to start with the Cortex-A5 and this
>>> e300.
>>> Processors like the M4 and SHARC only have single-precision FPUs.
>>
>> The MPC5200 seems to be the most embeddable one that I've found.
>>
>> DP floating point hardware is just big; that probably drives the "only 
>> big processors have DP floating point" issue.
>
> So far I have similar findings, thanks for the confirmation.

Just had a better look at the Cortex-Rx; I somehow skipped over this one
earlier. The R4's FPU (the R4F variant) always includes double precision;
the R5 and R7 have an option for double precision or optimized single
precision.

TI, for instance, has the Hercules safety controllers with an R4 core,
starting at LQFP-144 packages. Interesting, but I'm not sure they can
handle the memory requirements of our application (which are still under
investigation, btw).

The safety aspect of these controllers may be a plus for the intended
application.

-- 
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

There comes a time in the affairs of a man when he has to take the bull
by the tail and face the situation.
		-- W.C. Fields

In comp.arch.embedded,
Tim Wescott <tim@seemywebsite.really> wrote:
> On Fri, 22 Nov 2013 11:20:36 +0100, Stef wrote:
>> 
>> Were you looking for CPUs with double-precision FP? Did you find any
>> smallish ones? I'm looking for such a processor myself and it seems only
>> the bigger CPUs have DP FP. It seems to start with the Cortex-A5 and this
>> e300.
>> Processors like the M4 and SHARC only have single-precision FPUs.
>
> The MPC5200 seems to be the most embeddable one that I've found.
>
> DP floating point hardware is just big; that probably drives the "only 
> big processors have DP floating point" issue.

So far I have similar findings, thanks for the confirmation.

> What are you doing?  You may be able to solve your problem with fixed-
> point arithmetic and cleverness -- in my recent search, had I not talked 
> my way into a spot on a big processor I would have seriously considered 
> 64-bit fixed point, and there's a huge class of control system problems 
> that just can't be done with 32-bit floating point that work great with 
> 32-bit fixed point.

This is a measurement that requires (amongst other things) a non-linear
fitting procedure with the Levenberg-Marquardt algorithm on a few thousand
samples. We already found that disabling denormals does not significantly
affect the results (but it is noticeable) and gives a huge speed increase.

Going back to 32-bit floating point or 64/32-bit fixed point will require
testing on a lot of data and of course modifying the code (there are
also some math lib functions involved in the calculation). This is not off
the option list, but needs to be compared with 'safe' solutions (sticking
with double precision floating point).
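
For readers who have not met it, a minimal sketch of the kind of
Levenberg-Marquardt iteration referred to above. The model (a two-parameter
exponential), the synthetic data and the factor-of-10 damping schedule are
assumptions chosen purely for illustration; the actual measurement code is
not shown in this thread. Python floats are doubles, so this corresponds to
the 'safe' double-precision variant.

```python
import math


def model(a, b, x):
    # Hypothetical example model: y = a * exp(b * x)
    return a * math.exp(b * x)


def cost(a, b, xs, ys):
    # Sum of squared residuals
    return sum((y - model(a, b, x)) ** 2 for x, y in zip(xs, ys))


def lm_fit(xs, ys, a, b, lam=1e-3, max_iter=100, tol=1e-18):
    """Fit y = a*exp(b*x) with a basic Levenberg-Marquardt loop."""
    c = cost(a, b, xs, ys)
    for _ in range(max_iter):
        if c < tol:
            break
        # Accumulate J^T J (2x2) and J^T r (2x1), with r = y - model.
        saa = sab = sbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            ja = e              # d(model)/da
            jb = a * x * e      # d(model)/db
            r = y - a * e
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            ga += ja * r
            gb += jb * r
        # Marquardt damping: scale the diagonal by (1 + lambda).
        daa = saa * (1.0 + lam)
        dbb = sbb * (1.0 + lam)
        det = daa * dbb - sab * sab
        if det == 0.0:
            lam *= 10.0
            continue
        # Solve the damped 2x2 normal equations by Cramer's rule.
        da = (ga * dbb - gb * sab) / det
        db = (gb * daa - ga * sab) / det
        try:
            c_new = cost(a + da, b + db, xs, ys)
        except OverflowError:
            c_new = math.inf
        if c_new < c:
            # Step reduced the cost: accept it and relax the damping.
            a, b, c = a + da, b + db, c_new
            lam /= 10.0
        else:
            # Step made things worse: reject it and damp harder.
            lam *= 10.0
    return a, b


if __name__ == "__main__":
    xs = [0.1 * i for i in range(50)]
    ys = [2.0 * math.exp(-0.5 * x) for x in xs]   # noiseless synthetic data
    print(lm_fit(xs, ys, a=1.0, b=0.0))
```

Every iteration evaluates exp() and a handful of multiply-accumulates per
sample, which is why the cost of the math library and of the FPU's
double-precision operations dominates on a few thousand samples.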

-- 
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

He:	Let's end it all, bequeathin' our brains to science.
She:	What?!?  Science got enough trouble with their OWN brains.
		-- Walt Kelly

On Saturday, November 23, 2013 2:38:22 PM UTC-8, Ulf Samuelsson wrote:
> 2013-11-21 01:55, edward.ming.lee@gmail.com skrev:
> 
> >> It is not clear what kind of bandwidth you will get from the flash,
> >> but most flash memories will not run more than 20 MHz
> >> so running out of DRAM is typically faster.
> 
> > If i read it correctly, PIC32MZ requires 2 wait states at 200MHz.
> > So, program flash is probably running at around 70 to 80MHz.
> > BTW, flash instruction path is 128 bits with 16K cache.
> 
> I don't know how you came to that conclusion:
> 200 / (1+2) = 66.7 MHz.

Good catch on my bad math.  Should have said 60 to 70 MHz.
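
The arithmetic in question, as a small sketch. It assumes each flash access
costs one cycle plus the wait states, and (for the second function) that
instructions are 32 bits wide and every 128-bit line is fully used; the
function names are made up for this example, and the second figure is an
upper bound that ignores branches and the cache.

```python
def effective_flash_mhz(core_mhz, wait_states):
    """Flash access rate when each access costs 1 + wait_states core cycles."""
    return core_mhz / (1 + wait_states)


def peak_fetch_mwords_per_s(core_mhz, wait_states, bus_bits=128, insn_bits=32):
    """Upper bound on sequential instruction words fetched, in millions/s.

    Assumes every flash access returns a full line of useful instructions.
    """
    words_per_access = bus_bits // insn_bits
    return effective_flash_mhz(core_mhz, wait_states) * words_per_access


if __name__ == "__main__":
    print(f"flash access rate: {effective_flash_mhz(200, 2):.1f} MHz")
    print(f"peak fetch rate:   {peak_fetch_mwords_per_s(200, 2):.1f} M words/s")
```

Under those assumptions the 128-bit path can in principle deliver more
instruction words per second than a 200 MHz core consumes on straight-line
code, even though the flash array itself is accessed at only about 67 MHz.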

> And as someone else pointed out, the Renesas parts can run the flash at 
> 100 MHz.

Or you can run it from SRAM at 200 MHz+.

On an 80 MHz MX, I have overclocked to 120 MHz.

On a 200 MHz MZ, perhaps we can try 300 MHz.

Make sure you can disable the SRAM overclock with a boot option.  Otherwise, you might not be able to reprogram the chip.

2013-11-21 01:55, edward.ming.lee@gmail.com skrev:
>
>> It is not clear what kind of bandwidth you will get from the flash,
>> but most flash memories will not run more than 20 MHz
>> so running out of DRAM is typically faster.
>
> If i read it correctly, PIC32MZ requires 2 wait states at 200MHz.
> So, program flash is probably running at around 70 to 80MHz.
> BTW, flash instruction path is 128 bits with 16K cache.
>
I don't know how you came to that conclusion:
200 / (1+2) = 66.7 MHz.

And as someone else pointed out, the Renesas parts can run the flash at 
100 MHz.

BR
Ulf

Ulf Samuelsson <ulf@notvalid.emagii.com> wrote:

> It is not clear what kind of bandwidth you will get from the flash,
> but most flash memories will not run more than 20 MHz

Buy Renesas; they can do 100 MHz with zero wait states, and if that is
not fast enough, use an SH7264 with 1 MByte of internal RAM.
Oops... there is now a new SH7268 with 2624 KByte of internal RAM!

> In a real application, you are going to have problems if peripherals
> have to be handled in interrupts, and not with DMA,
> and the PIC32MZ only has 8 channels, which is not a lot.

The SH2A has 16 register banks for very fast switching on every IRQ. :-p

Oh... and 64 KByte of dual-port RAM is cute, too...

Olaf

Bill Giovino <billgiovino@gmail.com> wrote:

> By fastest MCU, I mean a general-purpose device - a microcontroller
> (not an SoC) with Flash and SRAM and peripherals. 330MIPS at that
> class is the best I've seen.

So take a look at this:

http://www.renesas.com/products/mpumcu/superh/sh7200/child/sh2a_features.jsp

360 MIPS at 200 MHz, and the errata list is short because the SH2 is old.

Olaf

On 11/20/2013 9:42 PM, dp wrote:
> On Wednesday, November 20, 2013 6:55:34 PM UTC+2, Tim Wescott wrote:
>> On Wed, 20 Nov 2013 00:24:33 -0800, Paul Rubin wrote:
>>
>>> Bill Giovino<billgiovino@gmail.com>  writes:
>>>> http://microcontroller.com/news/Microchip_PIC32MZ.asp The Microchip
>>>> PIC32MZ runs at 330MIPS at 200MHz
>>>
>>> What does this mean about being the fastest MCU?  Why is it interesting,
>>> since there are SoCs running at 1 GHz and faster, not to mention vector
>>> DSPs and that sort of thing?  Also, the PIC32MZ doesn't appear to have
>>> any floating point arithmetic, unlike the M4 which it seems to position
>>> itself against.  It would be more interesting if the PIC had IEEE double
>>> precision, since the ARM M4F only has single precision.
>>
>> IEEE double precision takes a lot more hardware to get really fast
>> operation.
>>
>> I sometimes wonder if there wouldn't be a way to implement double-
>> precision floating point in hardware that wouldn't take up more space
>> than 'fast' single-precision FP, but at the cost of a few clock ticks.
>> For a lot of algorithms, a double-precision calculation that happened 1/4
>> as fast as an in-hardware single-precision calculation would still be far
>> better than either taking the precision hit of 32-bit, or the speed hit
>> of software synthesized 64 bit.
>
> On the e300 core (and likely on others I am not so intimately familiar
> with), Freescale has the FPU doing 2-cycle FMUL, FMADD, etc. on 64-bit
> operands and 1-cycle on 32-bit ones. It works fine: at a 400 MHz core
> clock they talk about 800 MIPS, which, if not 100% usable in practice,
> does help. Interleaving FPU with integer (and, perhaps more importantly,
> load/store) instructions does work OK (I have managed a 2.2-cycle total
> within a 64-bit FIR loop, load/store included).
> How they traded off die size against performance I have no idea;
> I am just a user of theirs.
>
> Dimiter

Some people like to dig into the guts of things and compare on features
they can't measure.  I don't care a rat's rear how big the FPU is; I
care about the measurable features: price, cost, power, package size.
The die size may impact these, but so do many, many other parameters.
So I worry about the results I can see, not the ones I can't.

-- 

Rick

On Fri, 22 Nov 2013 11:20:36 +0100, Stef wrote:

> In comp.arch.embedded,
> Tim Wescott <tim@seemywebsite.really> wrote:
>> On Wed, 20 Nov 2013 19:11:24 -0800, dp wrote:
>>
>>> On Thursday, November 21, 2013 4:53:52 AM UTC+2, Tim Wescott wrote:
>>>> On Wed, 20 Nov 2013 18:42:12 -0800, dp wrote:
>>>> > 
>>>> > On the e300 core (and likely on others I am not so intimately
>>>> > familiar with), Freescale has the FPU doing 2-cycle FMUL, FMADD,
>>>> > etc. on 64-bit operands and 1-cycle on 32-bit ones. It works fine:
>>>> > at a 400 MHz core clock they talk about 800 MIPS, which, if not
>>>> > 100% usable in practice, does help. Interleaving FPU with integer
>>>> > (and, perhaps more importantly, load/store) instructions does work
>>>> > OK (I have managed a 2.2-cycle total within a 64-bit FIR loop,
>>>> > load/store included).
>>>> > How they traded off die size against performance I have no idea;
>>>> > I am just a user of theirs.
>>>> 
>>>> What chips does one find that core in?
>>> 
>>> The one I am using is the MPC5200B (watch out for the old 5200, still
>>> available but much buggier etc.). I have also used it on the 8240 (too
>>> old to consider now); they also have the MPC5125 and the MPC5121 (I
>>> have just been eyeing these, never used one).
>>> 
>>> 
>> Just four days ago I searched through the documentation for that chip,
>> and I came to the conclusion that the FPU only supported
>> single-precision floating point in hardware.  Aside from having to go
>> tell a customer that I had my head buried in my assumptions, I guess I
>> should be pleased to be wrong.
> 
> Were you looking for CPUs with double-precision FP? Did you find any
> smallish ones? I'm looking for such a processor myself and it seems only
> the bigger CPUs have DP FP. It seems to start with the Cortex-A5 and this
> e300.
> Processors like the M4 and SHARC only have single-precision FPUs.

The MPC5200 seems to be the most embeddable one that I've found.

DP floating point hardware is just big; that probably drives the "only 
big processors have DP floating point" issue.

What are you doing?  You may be able to solve your problem with fixed-
point arithmetic and cleverness -- in my recent search, had I not talked 
my way into a spot on a big processor I would have seriously considered 
64-bit fixed point, and there's a huge class of control system problems 
that just can't be done with 32-bit floating point that work great with 
32-bit fixed point.
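
An illustrative sketch, not from the original post, of one common reason
32-bit fixed point can beat 32-bit floating point: an integrator fed with
small increments. A single-precision float has a 24-bit significand, so once
the accumulator is near 1.0 any increment below about 6e-8 is rounded away,
whereas a 32-bit fixed-point word keeps all of its fractional bits. The
Q2.30 format, the step size and the function names are assumptions chosen
for the demonstration; single precision is emulated here by rounding through
`struct` after every addition.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def integrate_f32(start, step, n):
    """Add `step` to `start` n times, rounding to single precision each time."""
    acc = to_f32(start)
    step = to_f32(step)
    for _ in range(n):
        acc = to_f32(acc + step)
    return acc


Q = 30  # Q2.30: 30 fractional bits in a signed 32-bit word (illustrative)


def integrate_q30(start, step, n):
    """Same accumulation in 32-bit fixed point; result converted back to float."""
    acc = round(start * (1 << Q))
    inc = round(step * (1 << Q))   # the step is quantised to 2**-30 once
    for _ in range(n):
        acc += inc
        assert -(1 << 31) <= acc < (1 << 31), "would overflow int32"
    return acc / (1 << Q)


if __name__ == "__main__":
    n = 100_000
    print("float32:", integrate_f32(1.0, 1e-8, n))
    print("Q2.30:  ", integrate_q30(1.0, 1e-8, n))
    print("exact:  ", 1.0 + n * 1e-8)
```

The float accumulator never moves off 1.0, while the fixed-point one tracks
the true sum to within the one-time quantisation error of the step. The
price is that range and scaling have to be worked out by hand, which is the
"cleverness" part.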

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

> > Wish it had floating point like the STM32F4.
> 
> MIPS/ImgTec only announced their first microcontroller core with FP
> support a few months ago (and I'd really rather Microchip didn't try to
> roll their own. If they did, the errata sheet would probably mention
> something about badgers mauling your face.)
> 

FP aside, I am more interested in the 12-bit A2D and the 512K of SRAM. Hopefully, they won't mess these up too badly.  At least they have a 64-pin package.  Most M4s start at 100 pins.