STM32 ARM toolset advice?

Started by John Speth October 7, 2008
In article <bWQGk.4374$wG3.1249@newsfe23.ams2>, 
Wilco.removethisDijkstra@ntlworld.com says...
> > "Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.2355722479c1065e989919@newsgroups.comcast.net... > > In article <81ene4hsatqaphkmp01cikmpk9l7ana9qi@4ax.com>, > > nobody@spam.prevent.net says... > > >> Keil has been bought by ARM, and AFAIK they now use the compiler from > >> ARM. Apparently this compiler generates the best code for ARM's CPUs. > >> GCC Generates quite good code for the ARM these days. The biggest > >> drawback is the use of newlib. Rowley provides a nice IDE with GCC, > >> and their own library, which removes the one disadvantage of using > >> GCC. Their product is also available for Windows and Linux. > >> > >> > > One of the biggest problems I ran into with GCC-ARM when using the > > linux libraries for the TRITON boards, is that floating point operations > > are executed as kernel interrupts (Undefined Instruction generating > > jumps to a floating point emulator, I think.) I think it turned > > out to be several times slower than the IAR floating point library > > that runs in user mode. > > And given that the IAR floating point library is one of the slowest available, > that is quite slow indeed. >
I found that an ARM (a PXA-255 at 400 MHz, like the one on the Triton board) does an FP multiply in about 0.13 microseconds using the Soft-Float library. (It takes about 10X longer using floating-point emulation in the Linux kernel.)

http://albatross-uav.org/index.php/Benchmarks

I also found that an AT91SAM7S at 16 MHz using the IAR libraries had about 5 times the floating-point performance of an M68332 at the same clock using the CodeWarrior libraries. How much of that is better code and how much is due to a better CPU is open to question.

Mark Borgerson
"Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.2355df51ee57f03298991d@newsgroups.comcast.net...
> In article <bWQGk.4374$wG3.1249@newsfe23.ams2>, > Wilco.removethisDijkstra@ntlworld.com says... >> >> "Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.2355722479c1065e989919@newsgroups.comcast.net... >> > In article <81ene4hsatqaphkmp01cikmpk9l7ana9qi@4ax.com>, >> > nobody@spam.prevent.net says... >> >> >> Keil has been bought by ARM, and AFAIK they now use the compiler from >> >> ARM. Apparently this compiler generates the best code for ARM's CPUs. >> >> GCC Generates quite good code for the ARM these days. The biggest >> >> drawback is the use of newlib. Rowley provides a nice IDE with GCC, >> >> and their own library, which removes the one disadvantage of using >> >> GCC. Their product is also available for Windows and Linux. >> >> >> >> >> > One of the biggest problems I ran into with GCC-ARM when using the >> > linux libraries for the TRITON boards, is that floating point operations >> > are executed as kernel interrupts (Undefined Instruction generating >> > jumps to a floating point emulator, I think.) I think it turned >> > out to be several times slower than the IAR floating point library >> > that runs in user mode. >> >> And given that the IAR floating point library is one of the slowest available, >> that is quite slow indeed. >> > > I found t that an ARM (PXA-255 at 400MHz like that in the Triton board) > does a FP multiply in about 0.13 microseconds using the Soft-Float > libary. (It takes about 10X longer using floating-point emulation in > the Linux kernel.)
That's about 52 cycles, which is pretty good if it is a double-precision multiply.
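As a quick check of that figure, the cycle count is just the execution time multiplied by the clock frequency. The sketch below uses only numbers quoted in this thread (0.13 microseconds, 400 MHz, and the "about 10X" penalty for kernel emulation); the variable names are illustrative.

```python
# Convert the quoted timings into CPU cycles: cycles = time * clock frequency.
CLOCK_HZ = 400e6          # PXA-255 clock quoted in the thread
SOFT_FLOAT_S = 0.13e-6    # soft-float multiply time quoted in the thread
KERNEL_PENALTY = 10       # "about 10X longer" with kernel FP emulation

soft_float_cycles = SOFT_FLOAT_S * CLOCK_HZ
kernel_emulation_cycles = soft_float_cycles * KERNEL_PENALTY

print(f"soft-float multiply: ~{soft_float_cycles:.0f} cycles")
print(f"kernel FP emulation: ~{kernel_emulation_cycles:.0f} cycles")
```

Since the "10X" figure is itself approximate, the second number should be read as an order of magnitude rather than a measurement.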
> http://albatross-uav.org/index.php/Benchmarks > > I did find that an AT91SAM7S at 16Mhz using the IAR libraries, > had about 5 times the floating point performance of an M68332 at the > same clock using the Codewarrior libraries. How much of that is > better code and how much is due to a better CPU is open to > question.
The M68332 is such a slow CISC that one can do a full floating point multiply on ARM in less than half the time it takes the M68332 to execute one 32x32 multiply instruction. Wilco
In article <bM_Gk.19032$wU.1096@newsfe11.ams2>, 
Wilco.removethisDijkstra@ntlworld.com says...
> > "Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.2355df51ee57f03298991d@newsgroups.comcast.net... > > In article <bWQGk.4374$wG3.1249@newsfe23.ams2>, > > Wilco.removethisDijkstra@ntlworld.com says... > >> > >> "Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.2355722479c1065e989919@newsgroups.comcast.net... > >> > In article <81ene4hsatqaphkmp01cikmpk9l7ana9qi@4ax.com>, > >> > nobody@spam.prevent.net says... > >> > >> >> Keil has been bought by ARM, and AFAIK they now use the compiler from > >> >> ARM. Apparently this compiler generates the best code for ARM's CPUs. > >> >> GCC Generates quite good code for the ARM these days. The biggest > >> >> drawback is the use of newlib. Rowley provides a nice IDE with GCC, > >> >> and their own library, which removes the one disadvantage of using > >> >> GCC. Their product is also available for Windows and Linux. > >> >> > >> >> > >> > One of the biggest problems I ran into with GCC-ARM when using the > >> > linux libraries for the TRITON boards, is that floating point operations > >> > are executed as kernel interrupts (Undefined Instruction generating > >> > jumps to a floating point emulator, I think.) I think it turned > >> > out to be several times slower than the IAR floating point library > >> > that runs in user mode. > >> > >> And given that the IAR floating point library is one of the slowest available, > >> that is quite slow indeed. > >> > > > > I found t that an ARM (PXA-255 at 400MHz like that in the Triton board) > > does a FP multiply in about 0.13 microseconds using the Soft-Float > > libary. (It takes about 10X longer using floating-point emulation in > > the Linux kernel.) > > That's about 52 cycles, which is pretty good if it is double multiply.
I think it was only single-precision (32-bit) IEEE-754 format. It is also possible that the soft-float library did not properly handle NaN and some other conditions. It was supposedly highly optimized for the array-multiply operations for which it was used (part of an extended Kalman filter).
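To make concrete what skipping NaN handling can look like, here is a minimal sketch of a software single-precision multiply that works directly on IEEE-754 bit patterns. It is not the library discussed above, whose source is not shown in this thread; `soft_mul_f32` and the helper names are made up for illustration. It assumes normal inputs, flushes zeros and denormals to zero, and has no NaN or infinity check, which is the kind of shortcut that trades correctness in corner cases for speed.

```python
import struct

def f32_to_bits(x):
    """Round a Python float to single precision and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(bits):
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def soft_mul_f32(a, b):
    """Multiply two float32 bit patterns using integer operations only.

    Illustrative fast path: no NaN/infinity checks, denormals flushed to zero.
    """
    sign = (a ^ b) & 0x80000000
    exp_a = (a >> 23) & 0xFF
    exp_b = (b >> 23) & 0xFF
    if exp_a == 0 or exp_b == 0:          # zero or denormal input
        return sign
    man_a = (a & 0x7FFFFF) | 0x800000     # restore the hidden leading 1
    man_b = (b & 0x7FFFFF) | 0x800000
    product = man_a * man_b               # 48-bit result, bit 46 or 47 set
    exp = exp_a + exp_b - 127             # remove one copy of the bias
    shift = 23
    if product & (1 << 47):               # product in [2, 4): renormalize
        shift = 24
        exp += 1
    mantissa = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and mantissa & 1):
        mantissa += 1
        if mantissa == 1 << 24:           # rounding carried out of 24 bits
            mantissa >>= 1
            exp += 1
    if exp <= 0:                          # underflow: flush to zero
        return sign
    if exp >= 255:                        # overflow: infinity
        return sign | 0x7F800000
    return sign | (exp << 23) | (mantissa & 0x7FFFFF)

result = soft_mul_f32(f32_to_bits(1.5), f32_to_bits(2.0))
print(bits_to_f32(result))

# A NaN input is not detected; its exponent field overflows into infinity.
nan_result = soft_mul_f32(0x7FC00000, f32_to_bits(1.0))
print(hex(nan_result))
```

A fully conforming routine would test for exponent field 255 on entry and propagate NaN, at the cost of extra branches on every call.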
> > > http://albatross-uav.org/index.php/Benchmarks > > > > I did find that an AT91SAM7S at 16Mhz using the IAR libraries, > > had about 5 times the floating point performance of an M68332 at the > > same clock using the Codewarrior libraries. How much of that is > > better code and how much is due to a better CPU is open to > > question. > > The M68332 is such a slow CISC that one can do a full floating point > multiply on ARM in less than half the time it takes the M68332 to execute > one 32x32 multiply instruction. >
I agree. I used to have to worry about the effects of DIV and DIVU instructions on interrupt latency in some low-jitter analog input routines.

I suspect that there is some real art in the design and coding of an ARM FP library, so that the operations take full advantage of the shift/rotate-and-evaluate instructions and are scheduled to keep the pipelines full. I don't think the earlier IAR libraries took advantage of the instruction set as much as they could have, as the code is written in C (by P.J. Plauger in 1994, according to the available library source). A lot of the math.h routines also expand 32-bit floats to doubles before doing the math. That's a good way to maintain precision, but not the fastest way to do 32-bit FP math.

Mark Borgerson
On Tue, 7 Oct 2008 14:19:09 -0700, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <81ene4hsatqaphkmp01cikmpk9l7ana9qi@4ax.com>, >nobody@spam.prevent.net says... >> On Tue, 7 Oct 2008 08:01:31 -0700, "John Speth" <johnspeth@yahoo.com> >> wrote: >> >> >(Not to start a tools war but) >> > >> >I'm about to start a project that will use the STM32 ARM from ST. IAR and >> >Keil both supply high quality tool sets for the STM32. I've used both >> >toolsets' evaluation copies. I believe I'd be satisfied buying any one over >> >the other. >> > >> >Can anyone make any comments why one might be better than the other? >> > >> >At this point, it's a flip of the coin. I'd like to hear some practical >> >opinions of current users to help tip the scales. >> > >> >> Keil has been bought by ARM, and AFAIK they now use the compiler from >> ARM. Apparently this compiler generates the best code for ARM's CPUs. >> GCC Generates quite good code for the ARM these days. The biggest >> drawback is the use of newlib. Rowley provides a nice IDE with GCC, >> and their own library, which removes the one disadvantage of using >> GCC. Their product is also available for Windows and Linux. >> >> >One of the biggest problems I ran into with GCC-ARM when using the >linux libraries for the TRITON boards, is that floating point operations >are executed as kernel interrupts (Undefined Instruction generating >jumps to a floating point emulator, I think.) I think it turned >out to be several times slower than the IAR floating point library >that runs in user mode. > >I suppose it makes a big difference whether your ARM code is going >to run under Linux or on the bare silicon. All my IAR experience >is on ARM7TDMI without an OS, and all my GCC experience is >on a StrongArm chip under linux. I'd love to transition to >an RTOS with either system, but haven't found the free time to >get either MicroC/OSII or FreeRtos running with either set >of hardware.
Look at http://www.linuxdevices.com/articles/AT5920399313.html for how to get much faster ARM floating point under Linux. The speedup was over 10x for the floating-point benchmarks used in the article.

There was also a big speed improvement going from around gcc 2.x to around gcc 3.x; I cannot recall the exact versions. Someone rewrote the ARM floating-point library for gcc in optimized assembler. Most of the comparisons floating around between the commercial ARM compilers and gcc compare against old gcc 2.x compilers. We are now at gcc 4.x, which is MUCH faster than the old 2.x versions.

It also helps to enable even the minimum level of optimization in gcc.

Regards
Anton
donald wrote:

> I for one would like to move past windoze, but noone will pay me to > re-learn how to develop code.
What makes you think that switching to something other than Windows would invalidate all your knowledge about developing code? Is your experience really so limited that you would lose it all just by running the compiler on a different platform?
> I can figure out how to run most any compiler/development suite under > windoze.
So why do you think you couldn't do the same on Linux?
> I have never run a linux machine. > It would not take too long, however every time I try to get a job > without linux/unix experience on my resume, I am told I do not qualify. > > So I stay in the windoze world.
There's a major logic flaw in that line of reasoning. You're effectively saying you stay with Windows because employers all require you to know Unix...
Mark Borgerson wrote:

> Almost all my analysis tools, as well as IAR ARM and ICC430 run > under WinXP, so getting running GCC-ARM, that I was forced to use > for another project, means either running GCC-ARM in a VirtualBox > Ubuntu emulator, or toting a separate Linux laptop on business > trips.
Erm --- how exactly did you arrive at the conclusion that you couldn't run ARM GCC on Windows, too?
> I've managed to get the code development done, but I'm > still cursing the day I was told to use Linux in a hard real-time > system!
What does the choice of embedded target system have to do with that of the development host platform?
On 2008-10-08, Hans-Bernhard Bröker <HBBroeker@t-online.de> wrote:

>> I've managed to get the code development done, but I'm still >> cursing the day I was told to use Linux in a hard real-time >> system! > > What does the choice of embedded target system have to do with that of > the development host platform?
Running the compiler on Windows is one thing. Running the rest of the build environment required to put together a disk image for a Linux system is a lot more difficult. Building embedded Linux systems is indeed a lot easier on a *nix host.

Yes, you should be able to use Cygwin, but Cygwin is very fragile and doesn't always work well. In my experience, Cygwin is 2-3X slower as well.

--
Grant Edwards (grante at visi.com)
Yow! FUN is never having to say you're SUSHI!!
Grant Edwards wrote:
> Hans-Bernhard Bröker <HBBroeker@t-online.de> wrote: >> Mark Borgerson wrote: >> >>> I've managed to get the code development done, but I'm still >>> cursing the day I was told to use Linux in a hard real-time >>> system! >> >> What does the choice of embedded target system have to do with >> that of the development host platform? > > Running the compiler on Windows is one thing. Running the > rest of the build environment required to put together a disk > image for a Linux system is a lot more difficult. Building > embedded Linux systems is indeed a lot easier on a *nix host. > > Yes, you should be able to use Cygwin, but Cygwin is very > fragile and doesn't always work well. In my experience, Cygwin > is 2-3X slower as well.
Well, if the speed is the problem, try DJGPP. This is quite fast, avoids the GUI interface, and generally runs under Windows or MsDos or FreeDos. Not fragile. -- [mail]: Chuck F (cbfalconer at maineline dot net) [page]: <http://cbfalconer.home.att.net> Try the download section.
In article <gcj4ff$8hj$02$2@news.t-online.com>, HBBroeker@t-online.de 
says...
> Mark Borgerson wrote: > > > Almost all my analysis tools, as well as IAR ARM and ICC430 run > > under WinXP, so getting running GCC-ARM, that I was forced to use > > for another project, means either running GCC-ARM in a VirtualBox > > Ubuntu emulator, or toting a separate Linux laptop on business > > trips. > > Erm --- how exactly did you arrive at the conclusion that you couldn't > run ARM GCC on Windows, too?
By asking the person who sent me the Linux development system if it was possible to run the system under Windows using Cygwin. He told me they didn't know if that would work. That indicated that either it wouldn't work, he didn't know if it would work, or he didn't want to spend the time to get it to work. Thus spaketh the EE PhD from MIT! Who am I, a lowly MSc in Oceanography from Oregon State, to disagree! (Actually, as I later discussed, I did get it to work using the VirtualBox emulator on my WinXP system.)
> > > I've managed to get the code development done, but I'm > > still cursing the day I was told to use Linux in a hard real-time > > system! > > What does the choice of embedded target system have to do with that of > the development host platform? >
Nothing. They were two separate issues. I guess I didn't make that clear. The use of Linux on the embedded system caused many problems-- the first of which was the inability to do PWM output for servo motor control. The hardware didn't have the resources to do the PWM with 10-bit resolution, and it was impossible to do it with bit banging and software delay loops as core interrupts kept extending the pulse width enough to cause serious motor chatter. Mark Borgerson
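A rough calculation shows why interrupt-induced delays are so damaging for bit-banged servo PWM. The 10-bit resolution comes from the post; the 1 ms pulse span (1.0 to 2.0 ms, typical for hobby servos) and the 20 microsecond interrupt delay are assumptions chosen for illustration only, not measurements from this system.

```python
RESOLUTION_BITS = 10       # from the post
PULSE_SPAN_US = 1000.0     # ASSUMED: 1.0-2.0 ms hobby-servo pulse range
ASSUMED_DELAY_US = 20.0    # HYPOTHETICAL: one interrupt stretching the pulse

steps = 2 ** RESOLUTION_BITS
step_us = PULSE_SPAN_US / steps           # pulse-width change per count
error_steps = ASSUMED_DELAY_US / step_us  # delay expressed in counts

print(f"{steps} steps, {step_us:.3f} us per step")
print(f"a {ASSUMED_DELAY_US:.0f} us delay is ~{error_steps:.0f} counts of error")
```

Under these assumptions one count is just under a microsecond, so any software delay of tens of microseconds moves the commanded position by many counts, which is consistent with the motor chatter described.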
On Wed, 08 Oct 2008 14:18:31 -0700, Mark Borgerson wrote:
> The use of Linux on the embedded system caused many problems-- > the first of which was the inability to do PWM output for servo motor > control. The hardware didn't have the resources to do the PWM with > 10-bit resolution, and it was impossible to do it with bit banging and > software delay loops as core interrupts kept extending the pulse width > enough to cause serious motor chatter.
This happens to be exactly what EMC does. EMC is the Enhanced Machine Controller, software for numerical control of machinery: http://linuxcnc.org/

The EMC people have extensive experience running exactly this kind of job. They do it with RT Linux, and have a lot of material on their website on measuring and optimizing latencies for reliable real-time operation.

--
Przemek Klosowski, Ph.D. <przemek.klosowski at gmail>
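The kind of latency measurement mentioned above can be roughly approximated from user space by sleeping for a fixed interval many times and recording how late each wakeup is. This is only a sketch of the idea, not the EMC project's own tooling; `measure_wakeup_latency` and its parameters are hypothetical names, and an interpreted language adds overhead of its own, so the numbers are indicative at best.

```python
import time

def measure_wakeup_latency(interval_s=0.001, samples=100):
    """Return the wakeup overshoot, in microseconds, for each sample."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval_s)                 # ask to wake after interval_s
        elapsed = time.perf_counter() - start  # what we actually got
        overshoots.append((elapsed - interval_s) * 1e6)
    return overshoots

latencies = measure_wakeup_latency()
print(f"mean overshoot:  {sum(latencies) / len(latencies):.1f} us")
print(f"worst overshoot: {max(latencies):.1f} us")
```

For hard real-time work it is the worst case, not the mean, that matters, which is why such tests are usually run for long periods under load.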