Discussion Groups | Comp.Arch.Embedded | code optimiation

There are 57 messages in this thread.

You are currently looking at messages 10 to 20.

Re: code optimiation - David Brown - 15:19 10-05-08

Tomás Ó hÉilidhe wrote:
> On May 10, 9:11 am, "aamer" <raqeeb...@yahoo.com> wrote:
>> Dear all,
>>
>> Are there any hard and fast rules for code optimization in C targeting a
>> processor.
> 
> 
> I'd advocate using types like "uint_fast8_t" instead of "unsigned
> int"; that way you'll get good performance out of all kinds of
> machine, whether they be 8-Bit, 16-Bit or 5-billion-Bit. For instance
> if you use "unsigned int" on an 8-Bit microcontroller where an 8-Bit
> integer would suffice, then your code will be at least twice as slow
> because multiple instructions are used every time you do simple
> arithmetic.
> 

Using the "fast" types can make sense, especially for speed-critical 
code.  There are advantages in using the size-specific types, however - 
specifying "uint8_t" rather than "uint_fast8_t" may let the compiler (or 
linter) spot range errors that would not be found if "uint_fast8_t" 
boils down to a 32-bit value.  Given that the compiler can often 
optimise the generated code to use the best sized types available, it's 
seldom worth specifying "fast" types explicitly.
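
To make the range-error point concrete, here is a minimal sketch (the function names are invented for illustration and are not from the thread). It assumes a C99 <stdint.h>. Whether the "fast" version wraps at 256 depends on how wide uint_fast8_t happens to be on the target, which is exactly why a range error can go unnoticed.

```c
#include <stdint.h>

/* Exact-width type: the result always wraps at 256. */
uint8_t bump_exact(uint8_t x)
{
    return (uint8_t)(x + 1u);
}

/* "Fast" type: wraps at 256 only if uint_fast8_t is 8 bits wide on
   this target. Where it is wider (common on 32-bit machines),
   255 + 1 yields 256 instead of 0. */
uint_fast8_t bump_fast(uint_fast8_t x)
{
    return (uint_fast8_t)(x + 1u);
}
```

Code that silently relies on the wrap-around therefore behaves the same everywhere with uint8_t, but may differ between targets with uint_fast8_t.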

> Also I'd advocate using "built-in" parts of the language where
> possible, e.g.:
> 
>     unsigned arr[12] = {0};
> 

That's good advice, except that using "unsigned" contradicts your 
previous advice.  Personally, I dislike abbreviated types like 
"unsigned" - I always write the implicit "int" explicitly.

> instead of:
> 
>     unsigned arr;
>     memset(arr,0,sizeof arr);
> 

I presume you meant "unsigned arr[12];" here.

The main reason for using the {} initialiser rather than memset() or 
other methods is that it gives clearer and shorter source code - smaller 
and faster object code is a bonus (in some circumstances, compilers will 
optimise the memset() call to the same code anyway).
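
A small sketch of the two forms side by side, assuming the 12-element array from the quoted example (the function names are hypothetical). Both return 0 here, because all-bits-zero is a valid zero for unsigned integer types; the difference is in how the source reads.

```c
#include <string.h>

/* Zeroing with an initialiser: one declaration, no library call. */
unsigned int sum_after_initialiser(void)
{
    unsigned int arr[12] = {0};   /* every element starts at zero */
    unsigned int sum = 0;

    for (int i = 0; i < 12; i++) {
        sum += arr[i];
    }
    return sum;
}

/* Zeroing with memset(): note "sizeof arr" only works because arr
   is a real array here, not a pointer. */
unsigned int sum_after_memset(void)
{
    unsigned int arr[12];
    unsigned int sum = 0;

    memset(arr, 0, sizeof arr);
    for (int i = 0; i < 12; i++) {
        sum += arr[i];
    }
    return sum;
}
```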

> (Also the former is fully portable for dealing with types like
> pointers and floating point types whose "zero value" might not be all-
> bits-zero)
> 

It is virtually impossible to write fully portable code - and totally 
impossible within the world of embedded programming.  Forget the 
machines that have weird values for zeros, or bizarre numbers of bits 
(although some DSP's have 16-bit or 32-bit chars), or something other 
than two's complement arithmetic, or non-ASCII for their basic character 
set.  It's not worth it - code suitable for an ARM is not suitable for 
running on a 1970's mainframe anyway.

> Another thing would be about the use of the post-increment and post-
> decrement operators in a conditional. For instance:
> 
> void strcpy(char *dst, char const *src)
> {
>     while (*dst++ = *src++);
> }
> 
> The idiom of using *p++ is widespread, but unfortunately its use is no
> longer advisable because hardware has moved on. I think it was the
> PDP11 that had a single instruction for dereferencing a pointer and
> also incrementing it at the same time, thus it was beneficial to use
> *p++ wherever possible -- however modern machines don't have such an
> instruction, so the assembler produced for *p++ when used as the
> conditional in an if statement, for instance, might be sub-optimal. So
> I'd say opt for:
> 
>     for ( ; *dst = *src; ++dst, ++src) ;
> 

In any code review, that form would be taken out and shot.  Just because 
it is legal in C to write an ugly mess inside a for() statement, does 
not mean that it is sensible to write it.  It's not even going to 
produce smaller or faster code - any compiler that can't produce tight 
code for the original while() will produce poor code from this construct 
too.

The first idiom is so commonly used that it is clear to any reader - 
although I'd have two sets of parentheses (gcc convention to disable a 
warning) and perhaps a comment to say that I really meant a single "=".

For less capable compilers, you are probably better with:

	while ((*dst = *src)) {
		dst++;
		src++;
	}

That's far clearer to the reader, and easier for a less sophisticated 
compiler.

It's always important to examine the generated assembly code, and learn 
to know your target architecture and your compiler's idiosyncrasies if 
you want to get the best from it - don't guess randomly at the most 
obfuscated expression you can think of.
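
One way to follow that advice is to put the competing forms in a single file and compare the compiler output, for example with "gcc -O2 -S". The sketch below is only an illustration: the function names are made up, and copies_agree() assumes its argument is shorter than 32 characters.

```c
#include <string.h>

/* Classic idiom, with the doubled parentheses mentioned above. */
static void copy_postinc(char *dst, const char *src)
{
    while ((*dst++ = *src++)) {
        /* empty: the copy happens in the condition */
    }
}

/* The for() form proposed in the quoted post. */
static void copy_for(char *dst, const char *src)
{
    for ( ; (*dst = *src); ++dst, ++src) {
        /* empty */
    }
}

/* The spelled-out form for less capable compilers. */
static void copy_braced(char *dst, const char *src)
{
    while ((*dst = *src)) {
        dst++;
        src++;
    }
}

/* Returns 1 if all three variants copy s correctly.
   Assumption for this sketch: strlen(s) < 32. */
int copies_agree(const char *s)
{
    char a[32], b[32], c[32];

    copy_postinc(a, s);
    copy_for(b, s);
    copy_braced(c, s);
    return strcmp(a, s) == 0 && strcmp(b, s) == 0 && strcmp(c, s) == 0;
}
```

All three behave identically; whether the generated code differs is something to check on your own compiler and target rather than assume.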


> Moving on...
> 
> On most machines, I would use pointers instead of element indices for
> iterating thru an array. For example:
> 
>     char *p = arr;
>     char const *const pend = arr + LENGTH;
> 
>     do if ('a' == *p) return 1;
>     while (pend != ++p);
> 
> instead of:
> 
>     unsigned i = 0;
> 
>     do if ('a' == arr[i]) return 1;
>     while (LENGTH != ++i);
> 

First off, get yourself a decent compiler.  It will do the same job, and 
let you write the source code using proper array constructs.

Secondly, don't write a loop like that (first or second forms) without 
using brackets - it's unclear, and changes can easily break the code.

Third, forget the silly "if (constant == variable)" form of expression 
unless you are working for MISRA nazis (i.e., those that think the rules 
are unbendable).  The logical and sensible ordering when reading such a 
comparison is normally "if (variable == constant)".  If your compiler 
does not spot mistakes such as using a single "=" when you meant "==", 
get a better compiler or a better linter.


> The latter, on most architectures, is a hell of a lot slower. But then
> again there are some PC's that have a single instruction for "pointer
> + offset", so I can't discredit that technique altogether.
> 

Do you have any evidence whatsoever for such a wild claim?  A good 
compiler will use pointer instructions for array access, and will do the 
strength reduction turning the array loop into an incrementing pointer.  
Also, there are plenty of current modern architectures that have array 
memory modes that will be used as appropriate.
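
For comparison, here are the indexed and pointer versions of the search written with braces and the conventional "variable == constant" ordering. This is a hedged sketch: the function names and the explicit length parameter are assumptions, and unlike the quoted do/while loops these also handle a length of zero.

```c
#include <stddef.h>

/* Array-index form: the natural way to write it. */
int contains_indexed(const char *arr, size_t len, char c)
{
    for (size_t i = 0; i < len; i++) {
        if (arr[i] == c) {
            return 1;
        }
    }
    return 0;
}

/* Hand-written pointer form: the same search after manual
   strength reduction. */
int contains_pointer(const char *arr, size_t len, char c)
{
    const char *const end = arr + len;

    for (const char *p = arr; p != end; p++) {
        if (*p == c) {
            return 1;
        }
    }
    return 0;
}
```

With optimisation enabled, many compilers emit very similar code for both; comparing the assembly output is the way to confirm that on a given target.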

> On all architectures, I advocate the use of look-up tables instead of
> switch statements where applicable, especially when it's possible to
> have a look-up table containing function pointers.
> 

You can advocate all you want - fortunately most people will ignore you.  
The compiler will almost always generate better code for common switch 
cases than a lookup table - and will generate a jump table automatically 
as necessary.  This will be significantly smaller and faster than a 
lookup table of function pointers.  (There are plenty of good reasons 
for using a table of function pointers as a code construct - it's just 
that replacing switch statements is not one of them.)
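
The two constructs being argued about look roughly like this. It is a minimal sketch with invented names (apply_switch, apply_table, and a three-entry operation set); it shows the source-level trade-off only and makes no claim about which is faster on any particular compiler.

```c
typedef int (*op_fn)(int, int);

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

/* Plain switch: the compiler is free to turn this into a jump
   table, a compare chain, or inline arithmetic. */
int apply_switch(unsigned int op, int a, int b)
{
    switch (op) {
    case 0:  return a + b;
    case 1:  return a - b;
    case 2:  return a * b;
    default: return 0;
    }
}

/* Hand-built table of function pointers: needs an explicit bounds
   check and always pays for an indirect call. */
int apply_table(unsigned int op, int a, int b)
{
    static const op_fn table[] = { op_add, op_sub, op_mul };

    if (op >= sizeof table / sizeof table[0]) {
        return 0;
    }
    return table[op](a, b);
}
```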

> If you're ever dealing with a struct that has a lot of information in
> it which is common to a "type", then it might be advisable to follow
> C++'s idiom of removing that stuff from the struct and replacing it with
> a pointer to a single object which contains all the relevant
> information for that type (a V-Table, that is).
> 

It *might* be, but it sounds very unlikely.  What you describe is not a 
C++ idiom, and it's not a vtable - you are describing static data members.
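
For readers trying to picture the arrangement under discussion, here is a rough sketch in C. Everything in it (the sensor example, the struct and field names) is hypothetical; it only shows per-type data being moved out of each object and shared through a pointer.

```c
/* Data common to every sensor of one kind, stored once. */
struct sensor_type {
    const char *name;
    unsigned int resolution_bits;
    int min_reading;
    int max_reading;
};

/* Per-object data: just a pointer to the shared description
   plus whatever really differs between instances. */
struct sensor {
    const struct sensor_type *type;
    int reading;
};

static const struct sensor_type thermistor_type = {
    "thermistor", 10u, 0, 1023
};

int sensor_in_range(const struct sensor *s)
{
    return s->reading >= s->type->min_reading &&
           s->reading <= s->type->max_reading;
}

/* Two sensors share one type description; returns 1 if the first
   is in range and the second is not. */
int demo_sensors(void)
{
    struct sensor a = { &thermistor_type, 512 };
    struct sensor b = { &thermistor_type, 2000 };

    return sensor_in_range(&a) && !sensor_in_range(&b);
}
```

The saving is one copy of the common fields per object, at the cost of an extra pointer and an extra indirection on each access.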

> Emmm they're the main ones that come to mind right now.



Re: code optimiation - Nils - 15:40 10-05-08

My main-points (for speed and size) are:

1. Benchmark what's worth optimizing

2. Do all algorithmic optimizations first.

3. Get a good compiler. If you're using GCC consider compiling a new one.

4. Learn what the restrict-keyword from C99 does. Most compilers support 
it these days. Use restrict whenever possible, but never if you're not 
sure if it can be applied.

5. Don't use unsigned integers for loop-variables unless you need the 
wrap-around feature.

6. Let the compiler decide what to inline and what not. Don't inline 
functions just because you think the code will benefit from it.

7. Embedded CPU's often have small caches and slow external memory. Try 
to keep your working-set small. Packing multiple booleans or enums in a 
single integer may look dirty (less so if you hide the dirty details 
with macros), but it can increase the cache efficiency a lot.

And last: It's not worth trying to outsmart the compiler. Changing loops from 
indexing to pointer increment style is not worth it anymore. The 
compiler will do this job for you.
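
Points 4 and 7 in the list above can be sketched as follows. This is an illustration under assumptions, not code from the thread: the function and macro names are invented, and restrict is only valid here because the caller promises that the three arrays never overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Point 4: restrict tells the compiler dst, a and b do not alias,
   so it may reorder or vectorise the loads and stores. Passing
   overlapping arrays would be undefined behaviour. */
void add_arrays(int *restrict dst, const int *restrict a,
                const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}

/* Point 7: several booleans packed into one byte, with the bit
   twiddling hidden behind macros. */
#define FLAG_READY (1u << 0)
#define FLAG_ERROR (1u << 1)
#define FLAG_BUSY  (1u << 2)

#define FLAG_SET(w, f)   ((w) |= (uint8_t)(f))
#define FLAG_CLEAR(w, f) ((w) &= (uint8_t)~(f))
#define FLAG_TEST(w, f)  (((w) & (f)) != 0)

/* Returns 1 if add_arrays produced the expected sums. */
int demo_add(void)
{
    int a[3] = { 1, 2, 3 };
    int b[3] = { 10, 20, 30 };
    int d[3];

    add_arrays(d, a, b, 3);
    return d[0] == 11 && d[1] == 22 && d[2] == 33;
}

/* Sets READY and BUSY, clears READY, and returns the flag byte. */
uint8_t demo_flags(void)
{
    uint8_t flags = 0;

    FLAG_SET(flags, FLAG_READY);
    FLAG_SET(flags, FLAG_BUSY);
    FLAG_CLEAR(flags, FLAG_READY);
    return flags;
}
```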

Re: code optimiation - Dombo - 16:46 10-05-08

Nils wrote:

0. Starting with clear, well structured, code will help if you need to 
optimize it later.

> 1. Benchmark what's worth optimizing

Also use the benchmark to determine if there is a performance issue in 
the first place. Though aiming for efficient code is a lofty goal, other 
goals like correctness, robustness, maintainability, clarity...etc are 
often at least as (and usually more) important. No one will care how 
fast your code can produce incorrect results.

> 2. Do all algorithmic optimizations first.

Algorithmic optimizations can improve performance by orders of 
magnitude, code optimizations rarely improve performance by more than 
30% and usually much less than that.

> 3. Get a good compiler. If you're using GCC consider compiling a new one.
> 
> 4. Learn what the restrict-keyword from C99 does. Most compilers support 
> it these days. Use restrict whenever possible, but never if you're not 
> sure if it can be applied.
> 
> 5. Don't use unsigned integers for loop-variables unless you need the 
> wrap-around feature.
> 
> 6. Let the compiler decide what to inline and what not. Don't inline 
> functions just because you think the code will benefit from it.
> 
> 7. Embedded CPU's often have small caches and slow external memory. Try 
> to keep your working-set small. Packing multiple booleans or enums in a 
> single integer may look dirty (less so if you hide the dirty details 
> with macros), but it can increase the cache efficiency a lot.
> 
> And last: It's not worth trying to outsmart the compiler. Changing loops from 
> indexing to pointer increment style is not worth it anymore. The 
> compiler will do this job for you.

Or more generally: never assume some 'clever trick' will generate faster 
or smaller code - instead prove that the 'clever trick' will yield the 
desired effect. By 'prove' I mean measure (before and after) and/or 
check the compiler output (which also helps to develop a feel what is 
expensive and what not). Also remember that not all compilers are alike; 
some compilers optimize certain code sequences better than others.

I have seen too many examples of people obfuscating the source code 
assuming they are helping the compiler to generate more efficient code, 
while in reality they made things performance-wise no better and 
sometimes even worse.
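
A minimal sketch of "measure before and after", assuming a hosted environment where the standard clock() function is available. The checksum routine is a hypothetical stand-in for whatever code is being tuned; on a bare-metal target you would more likely read a hardware timer or toggle a pin and watch it on a scope.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical function under test. */
static unsigned long checksum(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Runs the function "repeats" times and returns the elapsed CPU
   time in seconds. The volatile sink keeps the compiler from
   discarding the calls as dead code. */
double time_checksum(unsigned int repeats)
{
    static unsigned char buf[4096];
    volatile unsigned long sink = 0;
    clock_t start, end;

    start = clock();
    for (unsigned int r = 0; r < repeats; r++) {
        sink += checksum(buf, sizeof buf);
    }
    end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Run it once with the original code and once with the 'clever trick', using the same compiler flags, and keep the trick only if the numbers justify it.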

Re: code optimiation - Tom - 18:27 10-05-08

In article <6...@mid.uni-berlin.de>, Nils <n...@cubic.org> wrote:
>My main-points (for speed and size) are:
>5. Don't use unsigned integers for loop-variables unless you need the 
>wrap-around feature.

Not necessarily true:

...................    while (my_unsigned8var < 5) {}
1BF0:  MOVF   x85,W
1BF2:  SUBLW  04
1BF4:  BNC   1BF8
1BF6:  BRA    1BF0
...................    while (my_signed8var < 5) {}
1BF8:  BTFSC  x86.7
1BFA:  BRA    1C02
1BFC:  MOVF   x86,W
1BFE:  SUBLW  04
1C00:  BNC   1C04
1C02:  BRA    1BF8
...................

On this particular combination of target and compiler (PIC18 with CCS C) 
unsigned is always faster than signed.


Re: code optimiation - Grant Edwards - 19:06 10-05-08

On 2008-05-10, Tom <t...@nospam.com> wrote:
> In article <6...@mid.uni-berlin.de>, Nils <n...@cubic.org> wrote:
>>My main-points (for speed and size) are:
>>5. Don't use unsigned integers for loop-variables unless you need the 
>>wrap-around feature.
>
> Not necessarily true:
>
> ...................    while (my_unsigned8var < 5) {}
> 1BF0:  MOVF   x85,W
> 1BF2:  SUBLW  04
> 1BF4:  BNC   1BF8
> 1BF6:  BRA    1BF0
> ...................    while (my_signed8var < 5) {}
> 1BF8:  BTFSC  x86.7
> 1BFA:  BRA    1C02
> 1BFC:  MOVF   x86,W
> 1BFE:  SUBLW  04
> 1C00:  BNC   1C04
> 1C02:  BRA    1BF8
> ...................
>
> On this particular combination of target and compiler (PIC18 with CCS C) 
> unsigned is always faster than signed.

I've seen various other compilers/targets where use of unsigned
loop indexes is faster.  For example, one of the tips/tricks
listed when using GCC for the MSP430 target:

  Tips and trick for efficient programming

    [...]
    
    10. Use unsigned int for indices - the compiler will snip
        _lots_ of code.

On second thought, that might be referring to array indexes
instead of loop indexes.  Hmm...
        
-- 
Grant Edwards                   grante             Yow! I want to read my new
                                  at               poem about pork brains and
                               visi.com            outer space ...

Re: code optimiation - Tomás Ó hÉilidhe - 22:56 10-05-08

On May 10, 8:19 pm, David Brown
<david.br...@hesbynett.removethisbit.no> wrote:

> It is virtually impossible to write fully portable code - and totally
> impossible within the world of embedded programming.  Forget the
> machines that have weird values for zeros, or bizarre numbers of bits
> (although some DSP's have 16-bit or 32-bit chars), or something other
> than two's complement arithmetic, or non-ASCII for their basic character
> set.  It's not worth it - code suitable for an ARM is not suitable for
> running on a 1970's mainframe anyway.


I write fully-portable code all the time and I find it to be a simple
task a lot of the time. The C Standard provides you with plenty of
information to write fully-portable algorithms and programs.

Re: code optimiation - Tomás Ó hÉilidhe - 23:13 10-05-08


I myself only use signed integer types when I need to store negative
numbers. Other reasons for going with unsigned are:
1) With signed integer types, you get undefined behaviour upon
overflow.
2) On machines other than two's complement, arithmetic can be less
efficient with signed.
3) You can be left with a trap representation if you play around with
the bits of a signed, depending on the system.

I see signed integer types as nasty and so I only use them when I
really have to.
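
Point 1 in the list above can be illustrated with a short sketch (the function names are invented for this example). Unsigned arithmetic is defined to wrap modulo 2^N, while signed overflow is undefined, so the signed case has to be checked before the addition rather than after it.

```c
#include <limits.h>

/* Unsigned: UINT_MAX + 1 is well defined and wraps to 0. */
unsigned int next_unsigned(unsigned int x)
{
    return x + 1u;
}

/* Signed: INT_MAX + 1 would be undefined behaviour, so test
   whether the increment is safe before doing it. */
int can_increment(int x)
{
    return x < INT_MAX;
}

/* Increments x if that is safe, otherwise leaves it saturated
   at INT_MAX. */
int next_signed_saturating(int x)
{
    return can_increment(x) ? x + 1 : INT_MAX;
}
```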

Re: code optimiation - Tim Wescott - 23:25 10-05-08

On Sat, 10 May 2008 19:56:03 -0700, Tomás Ó hÉilidhe wrote:

> On May 10, 8:19 pm, David Brown
> <david.br...@hesbynett.removethisbit.no> wrote:
> 
>> It is virtually impossible to write fully portable code - and totally
>> impossible within the world of embedded programming.  Forget the
>> machines that have weird values for zeros, or bizarre numbers of bits
>> (although some DSP's have 16-bit or 32-bit chars), or something other
>> than two's complement arithmetic, or non-ASCII for their basic
>> character set.  It's not worth it - code suitable for an ARM is not
>> suitable for running on a 1970's mainframe anyway.
> 
> 
> I write fully-portable code all the time and I find it to be a simple
> task a lot of the time. The C Standard provides you with plenty of
> information to write fully-portable algorithms and programs.

"I'm only 21 years of age".

Chances are good that you're pontificating to someone who's been earning 
money at this game since before you were an orgasm.

So is your experience vast, or your statement half-vast?

-- 
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com

Need to learn how to apply control theory in your embedded system?
"Applied Control Theory for Embedded Systems" by Tim Wescott
Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html

Re: code optimiation - Eric Smith - 03:51 11-05-08

Walter Banks <w...@bytecraft.com> writes:
> Compiler texts in general devote
> a lot of space to parsing, an activity that takes a small
> fraction of the time to implement a compiler.

Most compilers don't need optimization, since most compilers are for
things other than general-purpose programming languages.  Therefore
far more compiler writers need to know about parsing than need to
know about optimization.

Try books like _Advanced Compiler Design and Implementation_ by
Steven Muchnick.

Eric

Re: code optimiation - Walter Banks - 05:20 11-05-08


Eric Smith wrote:

> Walter Banks <w...@bytecraft.com> writes:
> > Compiler texts in general devote
> > a lot of space to parsing, an activity that takes a small
> > fraction of the time to implement a compiler.
>
> Most compilers don't need optimization, since most compilers are for
> things other than general-purpose programming languages.  Therefore
> far more compiler writers need to know about parsing than need to
> know about optimization.

This is a good point.

> Try books like _Advanced Compiler Design and Implementation_ by
> Steven Muchnick.

It was one of the books I was referring to that has
good descriptions of  individual optimization techniques but
deal with optimization management and application
level optimization strategy very well.

Regards

--
Walter Banks
Byte Craft Limited
Tel. (519) 888-6911
http://www.bytecraft.com
w...@bytecraft.com





