EmbeddedRelated.com

IAR Compiler and volatile keyword

Started by rawjoeshaw March 14, 2007
I have three functions below that *should* do the same thing. In
fact, they do the same thing on my Linux box gcc plain jane standard
compiler (substitute printf for SPI_Writes). However, in EWARM, they
do not do the same thing when programmed to my LPC2106. Only func3
behaves as expected.

My functions are:

void func1(unsigned long x, unsigned int y) {
    unsigned char a[3];
    a[0] = ((unsigned char*)(&x))[2];
    a[1] = ((unsigned char*)(&x))[1];
    a[2] = ((unsigned char*)(&x))[0];
    x += y;
    SPI_Write(a, 3);
    a[0] = ((unsigned char*)(&x))[2];
    a[1] = ((unsigned char*)(&x))[1];
    a[2] = ((unsigned char*)(&x))[0];
    x += y;
    SPI_Write(a, 3);
}

void func2(unsigned long x, unsigned int y) {
    unsigned char a[3];
    a[0] = *((unsigned char*)(&x)+2);
    a[1] = *((unsigned char*)(&x)+1);
    a[2] = *((unsigned char*)(&x));
    x += y;
    SPI_Write(a, 3);
    a[0] = *((unsigned char*)(&x)+2);
    a[1] = *((unsigned char*)(&x)+1);
    a[2] = *((unsigned char*)(&x));
    x += y;
    SPI_Write(a, 3);
}

void func3(unsigned long x, unsigned int y) {
    unsigned char a[3];
    a[0] = x >> 16;
    a[1] = x >> 8;
    a[2] = x;
    x += y;
    SPI_Write(a, 3);
    a[0] = x >> 16;
    a[1] = x >> 8;
    a[2] = x;
    x += y;
    SPI_Write(a, 3);
}

func1 and func2 do not show a change in the values written over SPI,
whereas func3 shows the changed values.

func1 and func2 can be remedied to work correctly by adding the
volatile keyword in front of the variables in the function prototype.
Then all three functions behave the same.

So my question is, what is going on? None of these are threaded
functions, array a and x aren't modifiable outside of the scope of the
function. So why the volatile keyword? Is it a bug in the compiler?
Is there something going on that I just don't see, specific to this
architecture? Or am I just an idiot at C? Thanks.
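The host-side comparison described in this post (a stand-in substituted for SPI_Write) can be sketched roughly as below. This is only an illustration under stated assumptions: the SPI_Write signature is a guess, since the post does not show its prototype, and spi_log, spi_count, MAX_FRAMES and host_is_little_endian are made-up names. func1 and func3 are copied from the post with explicit casts added.

```c
#include <assert.h>
#include <string.h>

#define MAX_FRAMES 4u

/* Hypothetical capture buffer: one 3-byte frame per SPI_Write call. */
static unsigned char spi_log[MAX_FRAMES][3];
static unsigned int spi_count;

/* Assumed signature; the real SPI_Write prototype is not shown in the post. */
static void SPI_Write(const unsigned char *data, unsigned int len)
{
    assert(data != NULL && len == 3u && spi_count < MAX_FRAMES);
    memcpy(spi_log[spi_count], data, len);
    spi_count++;
}

/* Pointer-based variant from the post: result depends on host byte order. */
static void func1(unsigned long x, unsigned int y)
{
    unsigned char a[3];
    a[0] = ((unsigned char *)(&x))[2];
    a[1] = ((unsigned char *)(&x))[1];
    a[2] = ((unsigned char *)(&x))[0];
    x += y;
    SPI_Write(a, 3);
    a[0] = ((unsigned char *)(&x))[2];
    a[1] = ((unsigned char *)(&x))[1];
    a[2] = ((unsigned char *)(&x))[0];
    x += y;
    SPI_Write(a, 3);
}

/* Shift-based variant from the post: independent of host byte order. */
static void func3(unsigned long x, unsigned int y)
{
    unsigned char a[3];
    a[0] = (unsigned char)(x >> 16);
    a[1] = (unsigned char)(x >> 8);
    a[2] = (unsigned char)x;
    x += y;
    SPI_Write(a, 3);
    a[0] = (unsigned char)(x >> 16);
    a[1] = (unsigned char)(x >> 8);
    a[2] = (unsigned char)x;
    x += y;
    SPI_Write(a, 3);
}

/* Returns nonzero when the least significant byte is stored first. */
static int host_is_little_endian(void)
{
    unsigned long probe = 1UL;
    return ((unsigned char *)&probe)[0] == 1u;
}
```

Resetting spi_count to 0 and calling func1(0x123456UL, 1u) or func3(0x123456UL, 1u) should leave two frames in spi_log. A correct compiler on a little-endian host should produce identical frames for both functions, with the second frame reflecting the incremented x.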


rawjoeshaw wrote:
> I have three functions below that *should* do the same thing. In
> fact, they do the same thing on my Linux box gcc plain jane standard
> compiler (substitute printf for SPI_Writes). However, in EWARM, they
> do not do the same thing when programmed to my LPC2106. Only func3
> behaves as expected.

The function that works in either condition performs 32-bit
operations and then assigns to an 8-bit value.

The functions that don't work use byte-addressing into an
integer value.

And the code works on a 'Linux box' ... which I suspect happens
to be little-endian x86; where accessing the bytes of an integer
happens to work ...

So my first guess is that this is an endianness issue ...

hex bytes in memory: 12 34 56 78
little-endian integer: 0x78563412
big-endian integer: 0x12345678

Conversely, pass the integer 0x12345678 to a function, and the bytes
in memory are

on a little-endian machine: 78 56 34 12
on a big-endian machine: 12 34 56 78

On either machine type, manipulating the integer 0x12345678 with
integer shifts will give you the correct bytes, whereas accessing
the bytes via pointers to the integer word in memory is
machine specific.
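The distinction between shifts and memory access can be sketched as follows. This is a minimal illustration, not code from the thread: to_bytes_shift, to_bytes_mem and is_little_endian are hypothetical helper names, and uint32_t is used so the width is the same on every host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Shift-based split: most significant byte first on any host. */
static void to_bytes_shift(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Memory-order split: copies the bytes exactly as the host stores them. */
static void to_bytes_mem(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &v, sizeof v);
}

/* Returns nonzero when the least significant byte is stored first. */
static int is_little_endian(void)
{
    const uint32_t probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1u;
}
```

For 0x12345678, to_bytes_shift yields 12 34 56 78 everywhere, while to_bytes_mem yields 78 56 34 12 on a little-endian host and 12 34 56 78 on a big-endian one.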

The difference observed when you added the volatile keyword is
best investigated by using objdump -d to look at the assembly code.
There is probably some indirect addressing that is used to access
the bytes in the integer, which gets converted to register-only
manipulations.

Welcome to the wonders of machine binary representations :)

I can't recall if the LPC is little or big-endian. So this could
all be wrong for your case ... but since you asked ...

Dave
Try using type punning for a more pointer-aliasing-safe way of doing this.

union int_char_union {
    unsigned long x;
    unsigned char c[4];
};

I believe this is required to work, but I'm not clear on exactly what
the rules are.
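A rough sketch of how such a union might be used is below. It assumes a C99-or-later compiler, where reading a union member other than the one last written reinterprets the stored bytes, and it swaps unsigned long for uint32_t so the array size matches on 64-bit hosts. low24_msb_first is a hypothetical helper, not something from the thread. Note that the union is still byte-order dependent, so the sketch probes the byte order first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

union int_char_union {
    uint32_t x;
    unsigned char c[sizeof(uint32_t)];
};

/* Writes the low 24 bits of value to out, most significant byte first.
 * Handles little- and big-endian hosts; mixed-endian layouts are ignored. */
static void low24_msb_first(uint32_t value, unsigned char out[3])
{
    union int_char_union u;
    union int_char_union probe;

    assert(out != NULL);
    u.x = value;
    probe.x = 1u;

    if (probe.c[0] == 1u) {
        /* Little-endian: least significant byte lives at c[0]. */
        out[0] = u.c[2];
        out[1] = u.c[1];
        out[2] = u.c[0];
    } else {
        /* Big-endian: least significant byte lives at c[3]. */
        out[0] = u.c[1];
        out[1] = u.c[2];
        out[2] = u.c[3];
    }
}
```

Calling low24_msb_first(0x00123456u, out) should leave 12 34 56 in out on either kind of host. The shift-based form in func3 gets the same result without needing the byte-order probe.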

-Ed
At 10:02 PM 3/14/2007 +0000, rawjoeshaw wrote:
>I have three functions below that *should* do the same thing. In
>fact, they do the same thing on my Linux box gcc plain jane standard
>compiler (substitute printf for SPI_Writes). However, in EWARM, they
>do not do the same thing when programmed to my LPC2106. Only func3
>behaves as expected.
>
>My functions are:
>
>void func1(unsigned long x, unsigned int y) {
> unsigned char a[3];
> a[0] = ((unsigned char*)(&x))[2];
> a[1] = ((unsigned char*)(&x))[1];
> a[2] = ((unsigned char*)(&x))[0];
> x += y;

Seems straightforward but consider what an optimizer might do. It could
rearrange it as follows

x += y;
a[0]=((unsigned char*)(&x))[2];
a[1]=((unsigned char*)(&x))[1];
a[2]=((unsigned char*)(&x))[0];

Adding volatile basically tells the compiler it isn't allowed to re-arrange
the operations. The fact that adding volatile makes a difference is telling.

There is enough information there I would have expected the optimizer to be
able to tell that re-arranging the operations is invalid. I suspect the
optimizer is a bit broken. Try reducing the optimization level and see if
the problem disappears, that (and a quick perusal of the generated asm)
would just about clinch it.

Robert

"C is known as a language that gives you enough rope to shoot yourself in
the foot." -- David Brown in comp.arch.embedded
http://www.aeolusdevelopment.com/
At 03:29 PM 3/14/2007 -0700, David Hawkins wrote:
>rawjoeshaw wrote:
> > I have three functions below that *should* do the same thing. In
> > fact, they do the same thing on my Linux box gcc plain jane standard
> > compiler (substitute printf for SPI_Writes). However, in EWARM, they
> > do not do the same thing when programmed to my LPC2106. Only func3
> > behaves as expected.



>And the code works on an 'Linux box' ... which I suspect happens
>to be little-endian x86; where accessing the bytes of an integer
>happens to work ...
>
>So my first guess is that this is an endianness issue ...

The LPCs are little endian. Although ARMs in general can be either.

Robert

http://www.aeolusdevelopment.com/

From the Divided by a Common Language File (Edited to protect the guilty)
ME - "I'd like to get Price and delivery for connector Part # XXXXX"
Dist./Rep - "$X.XX Lead time 37 days"
ME - "Anything we can do about lead time? 37 days seems a bit high."
Dist./Rep - "that is the lead time given because our stock is live.... we
currently have stock."
>> So my first guess is that this is an endianness issue ...
>
> The LPCs are little endian. Although ARMs in general can be either.

I knew the ARM core could be both, but couldn't remember
which the LPC implemented.

I'm working on FPGA code that interfaces to a big-endian
PowerPC, using little-endian-centric FPGA JTAG debugging tools,
so endianness issues are currently near-and-dear ;)

Cheers
Dave
1:02:40 AM, Thursday, March 15, 2007, rawjoeshaw wrote:

> I have three functions below that *should* do the same thing.

> My functions are:

> So my question is, what is going on.

Try to look into .lst file generated for your functions. It can give
you some idea about what is going on.

WBR, Alex
Hi All and thanks for the responses.

Considering everything, I'll show what I've learned. I figure some people would be interested.

I'm compiling with low optimization. Compiling with high optimization fixed the problem. So if I let the compiler optimize more, no need for fixes. Go figure.

I'll show the disassembly below. For the disassembly, I've created a simpler function that replicates the error. It is

void AT45_Read(unsigned long addr, unsigned char* buffer, unsigned int len) {
    volatile unsigned long tempLong; //use volatile keyword for it to work.
    tempLong = AT45_DeviceID();
    buffer[0] = *((unsigned char*)(&tempLong)+0);
    buffer[1] = *((unsigned char*)(&tempLong)+1);
    buffer[2] = *((unsigned char*)(&tempLong)+2);
    buffer[3] = *((unsigned char*)(&tempLong)+3);
    buffer[5] = 0x00;
}

Here is the disassembly from the working code:
void AT45_Read(unsigned long addr, unsigned char* buffer, unsigned int len) {
Next label is a Thumb label
AT45_Read:
0x00006D7C B510 PUSH {R4, LR}
0x00006D7E B081 SUB SP, SP, #4
0x00006D80 1C0C MOV R4, R1
tempLong = AT45_DeviceID();
0x00006D82 F7FF ; pre BL/BLX
0x00006D84 FFD9 BL AT45_DeviceID ; 0x6D38
0x00006D86 9000 STR R0, [SP, #0]
buffer[0] = *((unsigned char*)(&tempLong)+0);
0x00006D88 4668 MOV R0, SP
0x00006D8A 7800 LDRB R0, [R0, #0]
0x00006D8C 7020 STRB R0, [R4, #0]
buffer[1] = *((unsigned char*)(&tempLong)+1);
0x00006D8E 4668 MOV R0, SP
0x00006D90 7840 LDRB R0, [R0, #1]
0x00006D92 7060 STRB R0, [R4, #1]
buffer[2] = *((unsigned char*)(&tempLong)+2);
0x00006D94 4668 MOV R0, SP
0x00006D96 7880 LDRB R0, [R0, #2]
0x00006D98 70A0 STRB R0, [R4, #2]
buffer[3] = *((unsigned char*)(&tempLong)+3);
0x00006D9A 4668 MOV R0, SP
0x00006D9C 78C0 LDRB R0, [R0, #3]
0x00006D9E 70E0 STRB R0, [R4, #3]
buffer[5] = 0x00;
0x00006DA0 2000 MOV R0, #0
0x00006DA2 7160 STRB R0, [R4, #5]
}
0x00006DA4 B001 ADD SP, SP, #4
0x00006DA6 BC10 POP {R4}
0x00006DA8 BC01 POP {R0}
0x00006DAA 4700 BX R0
0x00006DAC 8004 STRH R4, [R0, #0]
0x00006DAE E002 B 0x006DB6

Here is the disassembly from the non-working code (volatile keyword removed):
void AT45_Read(unsigned long addr, unsigned char* buffer, unsigned int len) {
Next label is a Thumb label
AT45_Read:
0x00006D7C B510 PUSH {R4, LR}
0x00006D7E B082 SUB SP, SP, #8
0x00006D80 1C0C MOV R4, R1
tempLong = AT45_DeviceID();
0x00006D82 F7FF ; pre BL/BLX
0x00006D84 FFD9 BL AT45_DeviceID ; 0x6D38
0x00006D86 9001 STR R0, [SP, #4]
buffer[0] = *((unsigned char*)(&tempLong)+0);
0x00006D88 4668 MOV R0, SP
0x00006D8A 7900 LDRB R0, [R0, #4]
0x00006D8C 7020 STRB R0, [R4, #0]
buffer[1] = *((unsigned char*)(&tempLong)+1);
0x00006D8E 4668 MOV R0, SP
0x00006D90 7840 LDRB R0, [R0, #1]
0x00006D92 7060 STRB R0, [R4, #1]
buffer[2] = *((unsigned char*)(&tempLong)+2);
0x00006D94 4668 MOV R0, SP
0x00006D96 7880 LDRB R0, [R0, #2]
0x00006D98 70A0 STRB R0, [R4, #2]
buffer[3] = *((unsigned char*)(&tempLong)+3);
0x00006D9A 4668 MOV R0, SP
0x00006D9C 78C0 LDRB R0, [R0, #3]
0x00006D9E 70E0 STRB R0, [R4, #3]
buffer[5] = 0x00;
0x00006DA0 2000 MOV R0, #0
0x00006DA2 7160 STRB R0, [R4, #5]
}
0x00006DA4 B002 ADD SP, SP, #8
0x00006DA6 BC10 POP {R4}
0x00006DA8 BC01 POP {R0}
0x00006DAA 4700 BX R0
0x00006DAC 8004 STRH R4, [R0, #0]
0x00006DAE E002 B 0x006DB6

In the correct code, SP points to address 0x40003CBC.
In the incorrect code, SP points to address 0x40003CB8.
That makes sense, since the incorrect code subtracts 8 from SP, not 4.

Here are the values in memory:
0x40003CB8 9b 85 00 40
0x40003CBC 1f 26 00 00

The correct code stores 1f 26 00 00 in buffer.
The incorrect code stores 1f 85 00 40 in buffer.
That makes sense: the incorrect code stores tempLong at SP+4 and reads the first byte from SP+4, but then reads the remaining bytes from SP+1, SP+2 and SP+3 instead of SP+5, SP+6 and SP+7.
The correct code stores tempLong at SP+0 and reads SP+0, then SP+1, SP+2 and SP+3.
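The two results can be reproduced on paper, or with a small model like the one below. It is only a sketch: stack_image holds the eight bytes quoted from the memory dump starting at 0x40003CB8, the offset tables are read off the LDRB instructions in the two listings, and gather is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes at 0x40003CB8..0x40003CBF, as quoted from the memory dump. */
static const unsigned char stack_image[8] = {
    0x9b, 0x85, 0x00, 0x40,   /* 0x40003CB8 */
    0x1f, 0x26, 0x00, 0x00    /* 0x40003CBC: tempLong */
};

/* LDRB offsets from SP (= 0x40003CB8) in the non-working listing. */
static const size_t emitted_offsets[4] = { 4, 1, 2, 3 };

/* Offsets that would address all four bytes of tempLong at SP+4. */
static const size_t expected_offsets[4] = { 4, 5, 6, 7 };

/* Copies sp[offsets[i]] into out[i], mimicking the LDRB/STRB pairs. */
static void gather(const unsigned char *sp, const size_t offsets[4],
                   unsigned char out[4])
{
    size_t i;
    assert(sp != NULL && offsets != NULL && out != NULL);
    for (i = 0; i < 4; i++) {
        out[i] = sp[offsets[i]];
    }
}
```

With emitted_offsets the model yields 1f 85 00 40, matching the wrong buffer contents. With expected_offsets it yields 1f 26 00 00, matching the volatile build.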

So I'm guessing the compiler is doing something wrong. Maybe I'll let IAR know.

Thanks again for everyone's interest.
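Since func3 behaved correctly at the same optimization level, one possible workaround is to split the value with shifts so that the address of the local is never taken. The sketch below is untested on EWARM and makes several assumptions: AT45_DeviceID is replaced by a stub whose return value 0x0000261F is chosen only to match the 1f 26 00 00 seen in the memory dump, and AT45_CopyID is a hypothetical helper name. The byte order (least significant first) mirrors what the pointer version produces on the little-endian LPC2106.

```c
#include <assert.h>
#include <stddef.h>

/* Stub for the real AT45_DeviceID(); the value is an assumption chosen
 * to match the bytes 1f 26 00 00 shown in the memory dump. */
static unsigned long AT45_DeviceID(void)
{
    return 0x0000261FUL;
}

/* Copies the device ID into buffer, least significant byte first,
 * using shifts only (no pointer into the local variable). */
static void AT45_CopyID(unsigned char *buffer)
{
    unsigned long id;

    assert(buffer != NULL);
    id = AT45_DeviceID();
    buffer[0] = (unsigned char)(id & 0xFFu);
    buffer[1] = (unsigned char)((id >> 8) & 0xFFu);
    buffer[2] = (unsigned char)((id >> 16) & 0xFFu);
    buffer[3] = (unsigned char)((id >> 24) & 0xFFu);
}
```

With the stub above, AT45_CopyID should fill a 4-byte buffer with 1f 26 00 00, the same bytes the volatile build produced.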
