EmbeddedRelated.com
The 2024 Embedded Online Conference

[ C Programming Techniques: integer type optimization ]

Fabien Le Mentec, May 22, 2013

I am currently working on a voltage controller running on an ATmega328P, an 8-bit Atmel AVR microcontroller. The controller logic is implemented in the main() routine and relies on a periodic timer whose frequency is fixed at application setup. Among other things, the timer ISR increments some per-tick counters, which the main routine then uses to implement the voltage controller timing logic.

While looking at the code, someone noticed that I use the uint8_t type for counters instead of unsigned int. He listed some potential issues this could cause in the context of this project, and I explained the reasons and implications to him. I thought it would make a short but interesting topic to blog about.

There is actually more than one counter, plus some additional related logic. But for this post, we can assume the ISR looks like this:

 
#include <stdint.h>
#include <avr/io.h>
#include <avr/interrupt.h> /* provides the ISR() macro */

/* current version: */ static volatile uint8_t counter = 0;
/* initial version: static volatile unsigned int counter = 0; */

ISR(TIMER1_COMPA_vect)
{
  /* ... */

  if (counter != TIMER_MS_TO_TICKS(100))
  {
    ++counter;
  }

  /* ... */
}

Due to the timing logic and constraints (very interesting, but they would take more time to explain; maybe in another post), I reached the point where I had to optimize the ISR code a bit, and looked for places to reduce the cycle count. This is the reason for using uint8_t instead of unsigned int: with avr-gcc, the int type is 16 bits wide by default on this target. As the ATmega328P is an 8-bit microcontroller, one can expect 8-bit arithmetic to produce faster code. Let's compare the generated assembly code for the two versions:

/* uint8_t version */

#include <stdint.h>
#include <avr/interrupt.h>

static volatile uint8_t counter = 0;

ISR(TIMER1_COMPA_vect)
{
  /* ... */

  /* avr-gcc -mmcu=atmega328p -O2 */
  lds r24,counter
  cpi r24,lo8(100)
  brne .L1
  lds r24,counter
  subi r24,lo8(-(1))
  sts counter,r24
.L1:

  /* ... */
}
 

/* unsigned int version */
#include <avr/interrupt.h>

static volatile unsigned int counter = 0;

ISR(TIMER1_COMPA_vect)
{
  /* ... */

  /* avr-gcc -mmcu=atmega328p -O2 */
  lds r24,counter
  lds r25,counter+1
  cpi r24,100
  cpc r25,__zero_reg__
  brne .L4
  lds r24,counter
  lds r25,counter+1
  adiw r24,1
  sts counter+1,r25
  sts counter,r24
.L4:

  /* ... */
}
 

As expected, the instruction count is reduced. In addition, the Atmel 8-bit AVR instruction set manual (www.atmel.com/images/doc0856.pdf) specifies that the adiw instruction requires 2 clock cycles to complete. Overall, the cycle count of the unsigned int version is roughly twice that of the uint8_t one.
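As a rough cross-check of that ratio, the two listings can be tallied by hand. The sketch below assumes the usual ATmega328P timings (lds, sts and adiw at 2 cycles, cpi, cpc and subi at 1 cycle, brne at 1 cycle when not taken) and follows the fall-through path of each listing. The enum and function names are made up for illustration.

```c
/* Assumed per-instruction costs, in clock cycles, on an ATmega328P. */
enum {
  CYC_LDS = 2,
  CYC_STS = 2,
  CYC_ADIW = 2,
  CYC_CPI = 1,
  CYC_CPC = 1,
  CYC_SUBI = 1,
  CYC_BRNE_NOT_TAKEN = 1
};

/* uint8_t listing: lds, cpi, brne, lds, subi, sts */
static unsigned int cycles_uint8(void)
{
  return CYC_LDS + CYC_CPI + CYC_BRNE_NOT_TAKEN
       + CYC_LDS + CYC_SUBI + CYC_STS;
}

/* unsigned int listing: 2 lds, cpi, cpc, brne, 2 lds, adiw, 2 sts */
static unsigned int cycles_uint16(void)
{
  return 2 * CYC_LDS + CYC_CPI + CYC_CPC + CYC_BRNE_NOT_TAKEN
       + 2 * CYC_LDS + CYC_ADIW + 2 * CYC_STS;
}
```

Under these assumptions the tally is 9 cycles for the uint8_t version against 17 for the unsigned int one, which is close to a factor of two.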

While it is not the main point, note that because the variable is volatile, the compiler emits extra loads and stores. They could be removed by working on a non-volatile local copy and committing it back at the end of the operation.
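A minimal sketch of that local-copy idea follows. It is written as a plain function so that it can be compiled on a host machine; on the AVR, the same body would sit inside ISR(TIMER1_COMPA_vect). The names COUNTER_LIMIT and on_timer_tick are hypothetical, and the limit stands in for TIMER_MS_TO_TICKS(100).

```c
#include <stdint.h>

/* Hypothetical tick limit; the real code uses TIMER_MS_TO_TICKS(100). */
#define COUNTER_LIMIT 100

static volatile uint8_t counter = 0;

/* Body of the timer ISR, written as a plain function for illustration. */
static void on_timer_tick(void)
{
  uint8_t c = counter;   /* single volatile load */

  if (c != COUNTER_LIMIT)
  {
    ++c;
    counter = c;         /* single volatile store, only when the value changed */
  }
}
```

The volatile variable is now read once and written at most once per tick, so the compiler is free to keep the intermediate value in a register.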


While changing a variable type changes a single line of code, it has several important implications.

First, it reduces the counter capacity. In this case, I had to make sure that the timing-related logic still works when the maximum value drops from 0xffff to 0xff. Such points are easily missed, so they must be considered carefully. The situation is even worse when you come back to the code to add features long after it has been written. Here, comments help a lot.
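Besides comments, a compile-time check can guard against the limit silently outgrowing the counter. The sketch below is only an illustration: TIMER_FREQ_HZ and the definition of TIMER_MS_TO_TICKS are assumptions, since the real macro is not shown in this post.

```c
#include <stdint.h>

/* Hypothetical tick rate; the real value is fixed at application setup. */
#define TIMER_FREQ_HZ 1000UL

/* Assumed definition of the conversion macro used by the ISR. */
#define TIMER_MS_TO_TICKS(ms) (((ms) * TIMER_FREQ_HZ) / 1000UL)

/* Compilation fails if the limit no longer fits in an 8-bit counter. */
_Static_assert(TIMER_MS_TO_TICKS(100) <= UINT8_MAX,
               "counter limit does not fit in uint8_t");

static volatile uint8_t counter = 0;
```

If someone later raises the timer frequency or the delay so that the limit exceeds 255, the build breaks instead of the timing logic.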

A second, less obvious implication is that the optimization does not carry over to architectures whose arithmetic word size is not 8 bits. The code would still compile and work, but 8-bit arithmetic may have the opposite effect, i.e. add extra operations such as masking or zero extension. To solve this issue, one can define a type whose width defaults to the actual architecture word size, as seen from the instruction set point of view:

#if defined(__AVR_ATmega328P__)
typedef uint8_t uint_word_t;
#else
typedef unsigned int uint_word_t;
#endif

EDIT: As pointed out here, the solution is to use the uint_fast8_t type from <stdint.h>. Thanks to the author of this post.
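Here is a minimal sketch of the counter using uint_fast8_t, which the C standard defines as the fastest unsigned type with at least 8 bits. With avr-libc it is expected to be 8 bits wide, while other toolchains may pick a wider type. The names tick_count_t, COUNTER_LIMIT and on_timer_tick are hypothetical, and the ISR body is again shown as a plain function.

```c
#include <stdint.h>

/* Fastest unsigned type with at least 8 bits on the target. */
typedef uint_fast8_t tick_count_t;   /* hypothetical name */

/* Must stay <= 255, the only range guaranteed on every target. */
#define COUNTER_LIMIT 100

static volatile tick_count_t counter = 0;

/* Body of the timer ISR, written as a plain function for illustration. */
static void on_timer_tick(void)
{
  if (counter != COUNTER_LIMIT)
  {
    ++counter;
  }
}
```

Since the type may be wider than 8 bits elsewhere, the code should not rely on wrap-around at 0xff. The saturating comparison above avoids that dependency.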

I am always interested in techniques to reduce the cycle count of C code, especially in ISRs. If you have any, please share.



Comment by jbottinger1, May 23, 2013
Good advice. Programmers unaware of this sort of stuff definitely need to learn it.
