STM32 ARM toolset advice?

Reply by Anton Erasmus ● October 16, 2008

On Thu, 16 Oct 2008 00:33:02 -0500, Walter Banks
<walter@bytecraft.com> wrote:

>
>
>Anton Erasmus wrote:
>
>> We ported all our 68K Code from commercial compilers to GCC for 68k.
>> This is the same GCC used for Coldfire. The code was significantly
>> faster using GCC.
>>
>
>Anton,
>
>Did you find out where the GCC was faster?

We ported 2 main types of applications. One consists mainly of fairly
complex axis transformations and data filtering, while also handling
low latency comms to a host. The main task is executed at 100Hz
and the maximum, minimum and average execution time is calculated
every interrupt cycle. If I remember correctly the gcc code was about
30% faster. The axis transformations are mostly scaled integer with a
little bit of floating point.
The second app was one which displayed moving 2D icons over a live
video image. No graphics acceleration hardware was available.
Everything is done in software in a frame buffer. Again most
calculations were done in fixed point, with a little bit of floating
point. If I remember correctly, it was overall 20% faster, with some
low level graphic primitive routines almost 80% faster.
On the graphics routines, gcc did much better at register allocation.
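
As an illustration of the kind of per-cycle timing measurement described
above, here is a minimal sketch in C. It is not the code from the ported
application: the type and function names (exec_stats_t, exec_stats_update
and so on) and the assumption of a free-running 32-bit tick counter are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Running min/max/average of a task's execution time, in timer ticks.
   All names are hypothetical; the original code is not shown in the post. */
typedef struct {
    uint32_t min_ticks;
    uint32_t max_ticks;
    uint64_t total_ticks;
    uint32_t samples;
} exec_stats_t;

void exec_stats_init(exec_stats_t *s)
{
    s->min_ticks = UINT32_MAX;
    s->max_ticks = 0;
    s->total_ticks = 0;
    s->samples = 0;
}

/* Call once per cycle with timer readings taken before and after the task.
   Unsigned subtraction copes with one wrap of an assumed free-running
   32-bit timer. */
void exec_stats_update(exec_stats_t *s, uint32_t start, uint32_t end)
{
    uint32_t elapsed = end - start;

    if (elapsed < s->min_ticks) s->min_ticks = elapsed;
    if (elapsed > s->max_ticks) s->max_ticks = elapsed;
    s->total_ticks += elapsed;
    s->samples++;
}

/* Average over all cycles recorded so far. */
uint32_t exec_stats_average(const exec_stats_t *s)
{
    assert(s->samples > 0);
    return (uint32_t)(s->total_ticks / s->samples);
}
```

Comparing these three numbers before and after a compiler change is one
simple way to arrive at a figure like "about 30% faster".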

>
>Which compilers?

It was SDS and Microtec compilers. Both the compilers got more and
more expensive over time. Initially there were big improvements in new
versions. We stopped purchasing support when the new versions
basically did nothing much over the previous versions. Microtec also
started adding copy protection, which became a total pain to work
with.

Regards
  Anton Erasmus

Reply by Mark Borgerson ● October 16, 2008

In article <48F6D20E.8335FFB8@bytecraft.com>, walter@bytecraft.com 
says...
> 
> 
> Anton Erasmus wrote:
> 
> > We ported all our 68K Code from commercial compilers to GCC for 68k.
> > This is the same GCC used for Coldfire. The code was significantly
> > faster using GCC.
> >
> 
> Anton,
> 
> Did you find out where the GCC was faster?
> 
> Which compilers?
> 
I've not used GCC for the M68K, but I have many years' experience with
Codewarrior 68K. I was able to speed up some loops by factors near
two by using the DBRA (decrement and branch) instruction in assembly
language rewrites of C code. This was generally only necessary in
very tight loops for high speed data collection. The instruction
set and architecture of the M68K made assembly language routines
much simpler to write than is the case with the ARM.
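
For readers who have not met DBRA: it decrements the low 16 bits of a
data register and branches back unless the result is -1. A C loop shaped
the same way (a 16-bit counter starting at count - 1 and terminating
at -1) gives a compiler a reasonable chance of mapping the loop onto that
instruction, though whether any given compiler or optimization level
actually does so is not guaranteed. A minimal sketch, with a hypothetical
function name and sample type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` 16-bit samples (1..32768) from src to dst.
   The 16-bit counter starts at count - 1 and the loop ends when it
   reaches -1, mirroring what DBRA does in hardware. Whether a particular
   compiler emits DBRA for this loop is an assumption to check in the
   generated assembly, not a guarantee. */
void copy_samples(uint16_t *dst, const uint16_t *src, size_t count)
{
    assert(count >= 1 && count <= 32768);

    int16_t n = (int16_t)(count - 1);
    do {
        *dst++ = *src++;
    } while (--n != -1);
}
```

Inspecting the compiler's assembly listing for this function is the only
reliable way to see whether the loop was reduced to a move plus DBRA.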

The other common problem with Codewarrior (and other compilers I've
used) is that there seem to be a lot of redundant register loads
from stack-based variables. This may be because I generally
set optimization to the lowest level. That generally makes it
easier to read the assembly language output and single-step through the
code.


Mark Borgerson

Reply by David Brown ● October 17, 2008

Mark Borgerson wrote:
> In article <48F6D20E.8335FFB8@bytecraft.com>, walter@bytecraft.com 
> says...
>>
>> Anton Erasmus wrote:
>>
>>> We ported all our 68K Code from commercial compilers to GCC for 68k.
>>> This is the same GCC used for Coldfire. The code was significantly
>>> faster using GCC.
>>>
>> Anton,
>>
>> Did you find out where the GCC was faster?
>>
>> Which compilers?
>>
> I've not used GCC for the M68K, but I have many years experience with
> Codewarrior 68K.  I was able to speed up some  loops by factors near
> two by using the DBRA  (decrement and branch)   instruction in assembly
> language rewrites of C code.   This was  generally only necessary in
> very tight loops for high speed data collection.   The instruction
> set and architecture of the M68K made assembly language routines 
> much simpler to write than is the case with the ARM.
> 
> The other common problem with Codewarrior (and other compilers I've
> used) is that there seem to be a lot of redundant register loads
> from stack-based variables.   This may be because I generally
> set optimization to the lowest level.  That generally makes it
> easier to read the assembly language output and single-step through the 
> code.
> 

You set the compiler flags for low optimisation, and are surprised by 
getting sub-optimal code?

When you need to read or single-step generated assembly, it's often best 
not to have too low optimisation (or too high) - all these redundant 
stack accesses make the code hard to follow.

Reply by Paul Black ● October 17, 2008

On Oct 16, 10:56 pm, Mark Borgerson wrote:
> The other common problem with Codewarrior (and other compilers I've
> used) is that there seem to be a lot of redundant register loads
> from stack-based variables. This may be because I generally
> set optimization to the lowest level. That generally makes it
> easier to read the assembly language output and single-step through the
> code.

Why are you single stepping the machine instructions of the compiler
output so much that this is an issue? Is your compiler unreliable?

Paul

Reply by Mark Borgerson ● October 17, 2008

In article <d24a67aa-bc21-43c9-9d63-fd6b1cee0436
@y29g2000hsf.googlegroups.com>, lacuna@saturnine.org.uk says...
> On Oct 16, 10:56 pm, Mark Borgerson wrote:
> > The other common problem with Codewarrior (and other compilers I've
> > used) is that there seem to be a lot of redundant register loads
> > from stack-based variables. This may be because I generally
> > set optimization to the lowest level. That generally makes it
> > easier to read the assembly language output and single-step through the
> > code.
>
> Why are you single stepping the machine instructions of the compiler
> output so much that this is an issue? Is your compiler unreliable?
>

When I'm working on peripheral data transfers where I want
to transfer as quickly as possible, I quite often look
at the generated assembly language. I never did find an
optimization level for the M68K compiler where it
used the DBRA instructions. Another reason that
I keep the Codewarrior M68K compiler at a low optimization
level is that it was recommended by the SBC vendor.  This
may have something to do with the fact that the compiler
was really targeted for the PalmOS, but was being used
with another vendor's libraries and hardware.  There was
a time when you could get Codewarrior for the PalmOS for
about $400, while the standard Codewarrior M68K was over
$2000.

I generally don't step through the M68K code, as the
SBC that I use doesn't have good debug facilities.

I do sometimes step through MSP430 code using a JTAG
debugger.  The compiler that I use (Imagecraft) doesn't
have a lot of optimization choices---but does
have some redundant register loads.

Sorry if I got the two different cases mixed up in the
original post.

Mark Borgerson

Reply by Mark Borgerson ● October 17, 2008

In article <SKWdnS2drL8lt2XVnZ2dneKdnZydnZ2d@lyse.net>, 
david.brown@hesbynett.removethisbit.no says...
> Mark Borgerson wrote:
> > In article <48F6D20E.8335FFB8@bytecraft.com>, walter@bytecraft.com 
> > says...
> >>
> >> Anton Erasmus wrote:
> >>
> >>> We ported all our 68K Code from commercial compilers to GCC for 68k.
> >>> This is the same GCC used for Coldfire. The code was significantly
> >>> faster using GCC.
> >>>
> >> Anton,
> >>
> >> Did you find out where the GCC was faster?
> >>
> >> Which compilers?
> >>
> > I've not used GCC for the M68K, but I have many years experience with
> > Codewarrior 68K.  I was able to speed up some  loops by factors near
> > two by using the DBRA  (decrement and branch)   instruction in assembly
> > language rewrites of C code.   This was  generally only necessary in
> > very tight loops for high speed data collection.   The instruction
> > set and architecture of the M68K made assembly language routines 
> > much simpler to write than is the case with the ARM.
> > 
> > The other common problem with Codewarrior (and other compilers I've
> > used) is that there seem to be a lot of redundant register loads
> > from stack-based variables.   This may be because I generally
> > set optimization to the lowest level.  That generally makes it
> > easier to read the assembly language output and single-step through the 
> > code.
> > 
> 
> You set the compiler flags for low optimisation, and are surprised by 
> getting sub-optimal code?
> 
> When you need to read or single-step generated assembly, it's often best 
> not to have too low optimisation (or too high) - all these redundant 
> stack accesses make the code hard to follow.
> 
> 


I seem to recall a classic example from an early 8051 compiler: if you
set optimization high and to minimize memory, it would overlay
variables in the limited RAM space. That made reading the assembly
language pretty confusing at times.


Mark Borgerson

Reply by Walter Banks ● October 18, 2008

Mark Borgerson wrote:

> I seem to recall a classic example from an early 8051 compiler:   If you
> set optimization high and to minimize memory,  it would overlay
> variables in the limited RAM space.  That made reading the assembly
> language pretty confusing at times.
>
> Mark Borgerson

Mark,

The assembly can look confusing, but in a well implemented compiler
the variable can be followed by symbolic name as execution
walks through the compiled code. Physical RAM locations contain different
variables depending on the current PC value. The ChipTools 8051
symbolic debuggers did a good job of tracking code in Keil's 8051
compiler as early as the mid 90's.

The source level debugging code should be able to track a variable
even when it temporarily resides in a register. This resolves cases
where the local variable location is reassigned instead of being moved.

For example, with x and y both local:

y = x;
x = 29;

This code should not generate any code for y = x, only a symbol table
change and a source level debug reference change.
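
A compilable version of the fragment above might look like the following
sketch. The function name and the surrounding arithmetic are hypothetical,
added only so that the two assignments have something to operate on;
whether a given compiler really emits no instructions for y = x depends
on the compiler and its optimization settings.

```c
/* Hypothetical wrapper around the two assignments discussed above. */
int remap_example(int input)
{
    int x = input * 2;  /* x is live in some register or RAM location */
    int y;

    y = x;   /* a compiler may simply relabel x's storage as y here */
    x = 29;  /* x then gets a different register or location */

    return y + x;
}
```

A source level debugger that follows the relabelling would still show
y holding the old value of x after the first assignment, even though no
move instruction was executed.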

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

Reply by Mark Borgerson ● October 18, 2008

In article <48F9EE2C.CA6F7E19@bytecraft.com>, walter@bytecraft.com 
says...
> 
> 
> Mark Borgerson wrote:
> 
> > I seem to recall a classic example from an early 8051 compiler:   If you
> > set optimization high and to minimize memory,  it would overlay
> > variables in the limited RAM space.  That made reading the assembly
> > language pretty confusing at times.
> >
> > Mark Borgerson
> 
> Mark,
> 
> The assembly can look confusing, but in a well implemented compiler
> the variable can be followed by symbolic name as the compiled code
> walks through the code. Physical RAM locations contain different
> variables depending on the current PC value. The ChipTools  8051
> symbolic debuggers did a good job of tracking code in Keil's 8051
> compiler as early as the mid 90's

The early 90's is about the time frame that I was using the 8051.
IIRC, it was a small form factor package with only about 2K
of EPROM. At the time I was using that 8051 chip, a PIC variant,
the MC68HC16, and the M68K. I TRIED to stick with one chip or another
for at least a week to minimize the context switch overhead, but
was not generally successful. IIRC, debuggers at that time
generally involved external hardware with emulator pods---which
were well above the company budget limits.
> 
> The source level debugging code should be able to track a variable
> even when it temporarily resides in a register. This resolves cases
> where the local variable location is reassigned instead of being moved;
> 
> x and y both local
> 
> y = x;
> x = 29;
> 
> This code should not generate any code for y = x only a symbol table
> change and source level debug reference change..

I expect that if I ever go back to an 8051 variant, I will better
understand the development system and expect better debugging
facilities. However, as I'm in a low-volume market where
unit cost is not a major constraint, I'll probably stick with
the MSP430 series for very low power systems and one or another
of the ARM series where I need more processing power.


Mark Borgerson
