Arm Cortex-M7 - still single precision FPU

Started by Dave Nadler September 24, 2014
On 2014-09-25 rickman wrote in comp.arch.embedded:
> On 9/24/2014 9:00 PM, Dave Nadler wrote:
>> On Wednesday, September 24, 2014 7:36:33 PM UTC-4, Dave Nadler wrote:
>>> On Wednesday, September 24, 2014 7:31:31 PM UTC-4, Dave Nadler wrote:
>>>
>>>> On Wednesday, September 24, 2014 5:43:51 PM UTC-4, dalai lamah wrote:
>>>>> Actually Wikipedia claims that the M7 will have both SP and DP floating
>>>>> point:
>>>>> http://en.wikipedia.org/wiki/ARM_Cortex-M#Instruction_sets
>>>
>>> ST data-sheet says single-point...
>>> Page 12 here:
>>> http://www.st.com/st-web-ui/static/active/en/resource/technical/document/data_brief/DM00116941.pdf
>>
>> However, ARM says it has double precision here:
>> http://arm.com/products/processors/cortex-m/cortex-m7-processor.php
>
> The ARM info is marketing... notice all that they *don't* say. I
> believe even on the M7 floating point is optional. I'd believe the data
> sheet.
>
> Anyone looked at the R series? I searched about a bit and didn't find
> anything with an FPU. TI and Spansion are the only ones I found making
> them.
I think all TI R5 have an FPU, at least these do:
http://www.ti.com/product/TMS570LC4357/datasheet
http://www.ti.com/lit/ds/symlink/rm57l843.pdf

If you really need a fast DP FPU, the Renesas RZ may be an option. The onboard RAM sounded really appealing, but unfortunately production was not in time for our product so we switched to Xilinx Zynq. Dual core A9 with fast DP FPU and an FPGA along with it. Requires external DDR3 however, so we opted to use a MicroZed module, at least for the first series.

--
Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)

Sooner or later you must pay for your sins.
(Those who have already paid may disregard this cookie).
On Thu, 25 Sep 2014 03:39:36 +0200, rickman <gnuarm@gmail.com> wrote:
> Anyone looked at the R series? I searched about a bit and didn't find
> anything with an FPU. TI and Spansion are the only ones I found making
> them.
About half of the available TMS570 devices have an FPU.

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/
On 24/09/2014 22:19, Dave Nadler wrote:
> Looks like ARM has not moved to double-precision yet with M7.
> ST announced M7 parts that look really impressive except for this issue,
> and claims to have preview parts available.
> Wonder why still single precision??
> Hmmm...
>
I believe that, like a few things on the Cortex-M family, it is optional and up to the actual manufacturer to decide whether to include single or double (and maybe even none at all as per one of the Cortex-M4 options).

...of course, FreeRTOS already supports the M7 ;o)

Regards,
Richard.

+ http://www.FreeRTOS.org
Designed for microcontrollers. More than 107000 downloads in 2013.

+ http://www.FreeRTOS.org/plus
IoT, Trace, Certification, FAT FS, TCP/IP, Training, and more...
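
Since the FPU configuration is an implementer's option, one way to settle the question on real silicon is to read the core's Media and VFP Feature Register 0 (MVFR0). The sketch below only decodes a value that has already been read out (for example through a debugger). It assumes the ARMv7-M layout as I understand it: MVFR0 at address 0xE000EF40, single-precision support in bits [7:4] and double-precision support in bits [11:8]. The function name and the sample values are illustrative and do not come from this thread, so check them against the reference manual for the actual part.

```python
# Assumed memory-mapped address of MVFR0 on ARMv7-M cores (verify for your part).
MVFR0_ADDRESS = 0xE000EF40


def fpu_precision(mvfr0: int) -> str:
    """Decode the FPU precision fields of an MVFR0 value.

    Assumption: bits [7:4] are non-zero when single precision is
    implemented and bits [11:8] are non-zero when double precision is.
    """
    single = (mvfr0 >> 4) & 0xF
    double = (mvfr0 >> 8) & 0xF
    if double:
        return "single+double" if single else "double"
    return "single" if single else "none"


# Illustrative register values: no FPU, SP-only FPU, SP+DP FPU.
for value in (0x00000000, 0x10110021, 0x10110221):
    print(f"MVFR0=0x{value:08X}: {fpu_precision(value)}")
```

On the target itself the same decode would be a couple of shifts and masks on the word read from that address.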
On 9/25/2014 4:46 AM, Stef wrote:
> On 2014-09-25 rickman wrote in comp.arch.embedded:
>> On 9/24/2014 9:00 PM, Dave Nadler wrote:
>>> On Wednesday, September 24, 2014 7:36:33 PM UTC-4, Dave Nadler wrote:
>>>> On Wednesday, September 24, 2014 7:31:31 PM UTC-4, Dave Nadler wrote:
>>>>
>>>>> On Wednesday, September 24, 2014 5:43:51 PM UTC-4, dalai lamah wrote:
>>>>>> Actually Wikipedia claims that the M7 will have both SP and DP floating
>>>>>> point:
>>>>>> http://en.wikipedia.org/wiki/ARM_Cortex-M#Instruction_sets
>>>>
>>>> ST data-sheet says single-point...
>>>> Page 12 here:
>>>> http://www.st.com/st-web-ui/static/active/en/resource/technical/document/data_brief/DM00116941.pdf
>>>
>>> However, ARM says it has double precision here:
>>> http://arm.com/products/processors/cortex-m/cortex-m7-processor.php
>>
>> The ARM info is marketing... notice all that they *don't* say. I
>> believe even on the M7 floating point is optional. I'd believe the data
>> sheet.
>>
>> Anyone looked at the R series? I searched about a bit and didn't find
>> anything with an FPU. TI and Spansion are the only ones I found making
>> them.
>
> I think all TI R5 have an FPU, at least these do:
> http://www.ti.com/product/TMS570LC4357/datasheet
> http://www.ti.com/lit/ds/symlink/rm57l843.pdf
And you win the jackpot! TI doesn't make it easy to find out what these parts do without downloading the data sheets. Either many of their R4/R5 parts don't have an FPU or they are keeping it a secret. They have changed their web site over the last year or so and it seems to be much more marketing and less info. I did not see one selection guide that included this info. I get the impression they are making significant changes in this product line and much of the info is out of date.

Also, this series which is called "Hercules" under the "Safety" line seems to be all about the dual CPUs which I think are intended to run duplicate code as a redundant backup. They talk about running them in "lockstep" which means you can use logic to tell if they differ which would indicate a failure.
> If you really need a fast DP FPU, the Renesas RZ may be an option. The
> onboard RAM sounded really appealing, but unfortunately production was not
> in time for our product so we switched to Xilinx Zynq. Dual core A9 with
> fast DP FPU and an FPGA along with it. Requires external DDR3 however so
> we opted to use a MicroZed module, at least for the first series.
So far I haven't seen any speed requirements stated, just a request for double precision.

--
Rick
On Thu, 25 Sep 2014 05:12:33 -0400, rickman <gnuarm@gmail.com> wrote:

>Also, this series which is called "Hercules" under the "Safety" line
>seems to be all about the dual CPUs which I think are intended to run
>duplicate code as a redundant backup. They talk about running them in
>"lockstep" which means you can use logic to tell if they differ which
>would indicate a failure.
Only two CPUs?

If they disagree, how does the logic tell which gives the correct result and which doesn't?

You would need at least three CPUs or any voting systems.

Do these even have separate power supply pins for each CPU so that you could run them from two separate power supplies?
On 9/25/2014 10:42 AM, upsidedown@downunder.com wrote:
> On Thu, 25 Sep 2014 05:12:33 -0400, rickman <gnuarm@gmail.com> wrote:
>
>> Also, this series which is called "Hercules" under the "Safety" line
>> seems to be all about the dual CPUs which I think are intended to run
>> duplicate code as a redundant backup. They talk about running them in
>> "lockstep" which means you can use logic to tell if they differ which
>> would indicate a failure.
>
> Only two CPUs?
>
> If they disagree, how does the logic tell which gives the correct
> result and which doesn't?
>
> You would need at least three CPUs or any voting systems.
Error detection, not correction.
> Do these even have separate power supply pins for each CPU so that
> you could run them from two separate power supplies?
Protection from power failure is up to the rest of the design. It's for safety where you want things to not do damage I think. Not sure that it is required to continue working. But I'm not sure, just interpreting what I see.

--
Rick
On Thu, 25 Sep 2014 02:10:30 +0000, glen herrmannsfeldt wrote:

> Tim Wescott <seemywebsite@myfooter.really> wrote:
>> On Wed, 24 Sep 2014 14:19:50 -0700, Dave Nadler wrote:
>
>>> Looks like ARM has not moved to double-precision yet with M7.
>>> ST announced M7 parts that look really impressive except for this
>>> issue, and claims to have preview parts available. Wonder why still
>>> single precision??
>>> Hmmm...
>
>> I suspect that the size of the FPU goes up as bits^2 or bits^3 or
>> something obnoxious like that.
>
> A full Wallace tree (combinatorial) multiplier is bits^2, but that is
> rare.
>
> A fully pipelined multiplier is also bits^2, but can produce a new
> product every clock cycle, once the pipeline is full.
> (Nice for vector processors.) Less than fully pipelined produces a
> product every N cycles with O(bits^2/N) logic.
>
> Usual dividers are O(bits) space and O(bits) time.
>
> Newton-Raphson dividers use the pipelined multiplier, and produce a
> quotient in a small multiple of the number of cycles to run the
> multiplier.
>
> For specific examples, the IBM 360/91 and Cray-1 are favorites in books
> on pipelined processors.
>
> http://product.half.ebay.com/The-Architecture-of-Pipelined-Computers-by-Peter-M-Kogge-1981-Hardcover/1202954
>
> -- glen
I should have qualified my statement somehow, to indicate "for the same speed" or whatnot.

I read someplace that most of the area of an IEEE-compliant hardware FPU (and most of the lines of code for a similar software FP library) is involved in error trapping and exception handling. I dunno if the complexity of that goes as bits^1, bits^2, or bits^gawdaful.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
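
The scaling estimates quoted above can be turned into a back-of-the-envelope number. This is only a sketch under loud assumptions: it takes the significand widths of IEEE 754 single and double precision (24 and 53 bits, counting the implicit leading bit) and pretends area depends on nothing but that width raised to some exponent. It ignores exponent logic, rounding, and the exception handling mentioned above. The names are made up for illustration.

```python
# Significand widths in bits, including the implicit leading 1.
SIGNIFICAND_BITS = {"single": 24, "double": 53}


def double_vs_single(exponent: float) -> float:
    """Ratio of double to single cost if cost grows as bits**exponent."""
    single = SIGNIFICAND_BITS["single"] ** exponent
    double = SIGNIFICAND_BITS["double"] ** exponent
    return double / single


# exponent 1: O(bits) divider, exponent 2: bits^2 multiplier array,
# exponent 3: the pessimistic bits^3 guess.
for exponent in (1, 2, 3):
    print(f"bits^{exponent}: double costs about "
          f"{double_vs_single(exponent):.2f}x single")
```

Under these assumptions a bits^2 multiplier array comes out a little under five times larger for double precision, while an O(bits) divider only slightly more than doubles.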
On Thu, 25 Sep 2014 17:22:07 +0200, rickman <gnuarm@gmail.com> wrote:
> On 9/25/2014 10:42 AM, upsidedown@downunder.com wrote:
>> On Thu, 25 Sep 2014 05:12:33 -0400, rickman <gnuarm@gmail.com> wrote:
>>
>>> Also, this series which is called "Hercules" under the "Safety" line
>>> seems to be all about the dual CPUs which I think are intended to run
>>> duplicate code as a redundant backup. They talk about running them in
>>> "lockstep" which means you can use logic to tell if they differ which
>>> would indicate a failure.
>>
>> Only two CPUs?
>>
>> If they disagree, how does the logic tell which gives the correct
>> result and which doesn't?
>>
>> You would need at least three CPUs or any voting systems.
>
> Error detection, not correction.
>
>
>> Do these even have separate power supply pins for each CPU so that
>> you could run them from two separate power supplies?
>
> Protection from power failure is up to the rest of the design. It's for
> safety where you want things to not do damage I think.
This kind of safety most often comes down to keeping a system in a state where nobody dies. If the lock-step processor fails, this generally signals a more low-level circuit to bring the system to a non-operational, but still safe state.

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/
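
As a rough illustration of the "detect, then go safe" behaviour described here, the sketch below models two cores as plain functions and a comparator that forces a safe state as soon as their outputs differ. It is a toy model, not how the Hercules parts are implemented: the real comparison happens in hardware on every cycle, and all names here are hypothetical.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SAFE = "safe"  # non-operational, but nothing dangerous is driven


def lockstep_step(core_a, core_b, inputs):
    """Run both cores on the same inputs and compare their outputs.

    Returns (State.RUNNING, output) when they agree. On a mismatch it
    returns (State.SAFE, None): with only two cores the comparator can
    detect the fault but cannot tell which core is wrong.
    """
    out_a = core_a(inputs)
    out_b = core_b(inputs)
    if out_a != out_b:
        return State.SAFE, None
    return State.RUNNING, out_a


def healthy_core(x):
    return x * 2


def faulty_core(x):
    # Hypothetical fault: one flipped low bit in the result.
    return (x * 2) ^ 1


print(lockstep_step(healthy_core, healthy_core, 3))
print(lockstep_step(healthy_core, faulty_core, 3))
```

The second call ends in `State.SAFE` with no output, which is the error-detection-only behaviour discussed in the quoted posts.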
On 26/09/14 09:26, Boudewijn Dijkstra wrote:
> On Thu, 25 Sep 2014 17:22:07 +0200, rickman <gnuarm@gmail.com> wrote:
>> On 9/25/2014 10:42 AM, upsidedown@downunder.com wrote:
>>> On Thu, 25 Sep 2014 05:12:33 -0400, rickman <gnuarm@gmail.com> wrote:
>>>
>>>> Also, this series which is called "Hercules" under the "Safety" line
>>>> seems to be all about the dual CPUs which I think are intended to run
>>>> duplicate code as a redundant backup. They talk about running them in
>>>> "lockstep" which means you can use logic to tell if they differ which
>>>> would indicate a failure.
>>>
>>> Only two CPUs?
>>>
>>> If they disagree, how does the logic tell which gives the correct
>>> result and which doesn't?
>>>
>>> You would need at least three CPUs or any voting systems.
True error correction is a lot harder than just "majority voting" of three cpus.
>>
>> Error detection, not correction.
Correct. I haven't read the details of this chip, but there can also be other detection mechanisms, such as ECC on buses, that are used to spot which cpu is having trouble. If the hardware can identify which cpu has failed, then the other one can continue in a "limp" mode.
>>
>>> Do these even have separate power supply pins for each CPU so that
>>> you could run them from two separate power supplies?
>>
>> Protection from power failure is up to the rest of the design. It's
>> for safety where you want things to not do damage I think.
>
> This kind of safety most often comes down to keeping a system in a state
> where nobody dies. If the lock-step processor fails, this generally
> signals a more low-level circuit to bring the system to a
> non-operational, but still safe state.
>
Indeed. There are many systems where it is sufficient to stop everything if there is a critical failure. There are also many that can have a simple safe mode (such as a car - if a failure is detected, you keep the brakes and steering going but bring the engine to a controlled stop).

A chip like this can detect chip failures and then perhaps run in a limp mode, such as at lower speed or with cache disabled (cache errors are a substantial part of single-event upsets).

I heard somewhere a little about the physical layout of chips like the Hercules (though I may well be mixing this up with similar lock-step chips from Freescale's MPC range) - they do things like lay out the two cpus at 90 degrees and upside down, so that electrical interference will affect the two cpus differently. Sometimes the second cpu layout is done entirely separately by a different group from the first layout.

If you need full operation after critical failure, you need a more complex system. I believe that in the aircraft industry, they use majority voting from three processor boards - but each board has a different type of processor, running software from different development teams, so that systematic errors in one design will not affect the others.
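
For contrast with two-way lock-step, a three-way majority vote can be sketched as below. This only shows the voting step itself. As noted earlier in this post, true error correction is a lot harder than this, since a real system also has to cope with timing skew, results that are close but not bit-identical, and failure of the voter. The function name is hypothetical.

```python
from collections import Counter


def majority_vote(results):
    """Return the value reported by a strict majority of the channels.

    `results` holds one hashable result per channel, e.g. three boards.
    Raises ValueError when no value has a strict majority, in which case
    the system has to fall back to some safe state instead.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 > len(results):
        return value
    raise ValueError("no majority among channel results")


# One of three channels disagrees: the other two outvote it.
print(majority_vote([42, 42, 41]))
```

With three channels a single faulty result is masked, while three different answers still leave nothing to do but stop safely.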
On Wednesday, September 24, 2014 9:00:00 PM UTC-4, Dave Nadler wrote:
> On Wednesday, September 24, 2014 7:36:33 PM UTC-4, Dave Nadler wrote:
> > On Wednesday, September 24, 2014 7:31:31 PM UTC-4, Dave Nadler wrote:
> > ST data-sheet says single-point...
> > Page 12 here:
> > http://www.st.com/st-web-ui/static/active/en/resource/technical/document/data_brief/DM00116941.pdf
>
> However, ARM says it has double precision here:
> http://arm.com/products/processors/cortex-m/cortex-m7-processor.php
Follow-up: ST clarifies that their M7 is indeed SINGLE precision FP.
ST claim (conversation this AM with local ST specialist):
- ARM couldn't deliver IP for double
- competitors Freescale and Atmel will also only deliver single in short-term
- expect double FP ST parts in ~one year

Best Regards, Dave