EmbeddedRelated.com
Forums

Cortex M3 + GCC 4.1.1 + Rowley Crossworks = Strange optimization results

Started by Darcy February 26, 2009
Hi all,

Okay, I know this is the LPC2000 forums but there still aren't any good
Cortex forums... plus the Rowley guy regularly trawl this forum :)

This issue may affect anyone using GCC and ARM7s as well so I don't feel
too guilty for posting here.
The scenario is that during power-on the Cortex M3 waits for its
oscillator to stabilise. Once done, off we go. If it fails then it
starts running on the internal oscillator. There is a particular
function in one of the library files that suffers severe problems when
optimised - resulting in the processor running on the internal
oscillator instead.

The function is found in stm32f10x_rcc.c and looks something like
this...

FlagStatus RCC_GetFlagStatus(u8 RCC_FLAG)
{
u32 tmp = 0; //Set to volatile to fix this problem
u32 statusreg = 0;
FlagStatus bitstatus = RESET;

/* Check the parameters */
assert_param(IS_RCC_FLAG(RCC_FLAG));

/* Get the RCC register index */
tmp = RCC_FLAG >> 5;

if (tmp == 1) /* The flag to check is in CR register
*/
{
statusreg = RCC->CR;
}
else if (tmp == 2) /* The flag to check is in BDCR register
*/
{
statusreg = RCC->BDCR;
}
else /* The flag to check is in CSR register
*/
{
statusreg = RCC->CSR;
}

/* Get the flag position */
tmp = RCC_FLAG & FLAG_Mask;

if ((statusreg & ((u32)1 << tmp)) != (u32)RESET)
{
bitstatus = SET;
}
else
{
bitstatus = RESET;
}

/* Return the flag status */
return bitstatus;
}

The problem occurs specifically with the "tmp" variable. If "tmp" is
not declared as volatile then the final if condition will ALWAYS set
bitstatus = RESET. RCC_FLAG is always passed in as 0x31

This seems to us like a compiler error. I would never have thought
setting "tmp" as volatile would fix this as I certainly didn't think it
would cause a problem in the first place.

I'm wondering if this has something to do with the Thumb2 instruction
set?

The version of GCC currently provided with Rowley CrossStudio for ARM is
4.1.1

This problem was also reported on the ST Micro Cortex M3 forums but no
one responded Link Here

Hopefully someone might have an idea as to what's going on here. I
don't really want to have to modify the libraries just to fix this one
small problem as I can't help but be worried as to what else is going on
in all the other library files!

Cheers
Darcy



An Engineer's Guide to the LPC2100 Series

Hi all,

>
> Okay, I know this is the LPC2000 forums but there still aren't any
good
> Cortex forums... plus the Rowley guy regularly trawl this forum :)

How about these 2?

http://tech.groups.yahoo.com/group/stm32/
or
http://tech.groups.yahoo.com/group/Cortex-M3/

>
> This issue may affect anyone using GCC and ARM7s as well so I don't
feel
> too guilty for posting here.
> The scenario is that during power-on the Cortex M3 waits for its
> oscillator to stabilise. Once done, off we go. If it fails then it
> starts running on the internal oscillator. There is a particular
> function in one of the library files that suffers severe problems
when
> optimised - resulting in the processor running on the internal
> oscillator instead.
>

What optimization level are you using?

> The function is found in stm32f10x_rcc.c and looks something like
> this...

Q:

Any chance this call/macro is doing something to the RCC_FLAG?

assert_param(IS_RCC_FLAG(RCC_FLAG));

If not then code that sets bitstatus looks OK too

if ((statusreg & ((u32)1 << tmp)) != (u32)RESET)
{
bitstatus = SET;
}
else
{
bitstatus = RESET;
}

/* Return the flag status */
return bitstatus;

So it could be opitmization problem fixed by declaring tmp
volatile.

Anyway, the best way to diagnose or confirm this problem would be
to compare assembly output from the compiler.

Can you post the assembly for both cases?

>
> FlagStatus RCC_GetFlagStatus(u8 RCC_FLAG)
> {
> u32 tmp = 0; //Set to volatile to fix this problem
> u32 statusreg = 0;
> FlagStatus bitstatus = RESET;
>
> /* Check the parameters */
> assert_param(IS_RCC_FLAG(RCC_FLAG));
>
> /* Get the RCC register index */
> tmp = RCC_FLAG >> 5;
>
> if (tmp == 1) /* The flag to check is in CR
register
> */
> {
> statusreg = RCC->CR;
> }
> else if (tmp == 2) /* The flag to check is in BDCR
register
> */
> {
> statusreg = RCC->BDCR;
> }
> else /* The flag to check is in CSR
register
> */
> {
> statusreg = RCC->CSR;
> }
>
> /* Get the flag position */
> tmp = RCC_FLAG & FLAG_Mask;
>
> if ((statusreg & ((u32)1 << tmp)) != (u32)RESET)
> {
> bitstatus = SET;
> }
> else
> {
> bitstatus = RESET;
> }
>
> /* Return the flag status */
> return bitstatus;
> }
>
> The problem occurs specifically with the "tmp" variable. If "tmp"
is
> not declared as volatile then the final if condition will ALWAYS set
> bitstatus = RESET. RCC_FLAG is always passed in as 0x31
>
> This seems to us like a compiler error. I would never have thought
> setting "tmp" as volatile would fix this as I certainly didn't
think it
> would cause a problem in the first place.
>
> I'm wondering if this has something to do with the Thumb2
instruction
> set?
>
> The version of GCC currently provided with Rowley CrossStudio for
ARM is
> 4.1.1
>
> This problem was also reported on the ST Micro Cortex M3 forums but
no
> one responded Link Here cat-7515-23.html> Hopefully someone might have an idea as to what's going on here. I
> don't really want to have to modify the libraries just to fix this
one
> small problem as I can't help but be worried as to what else is
going on
> in all the other library files!
>
> Cheers
> Darcy
>
>
>

Cheers,
Pawel

Hi, thanks for the reply

> How about these 2?
>
> http://tech.groups.yahoo.com/group/stm32/
> or
> http://tech.groups.yahoo.com/group/Cortex-M3/

Each of those groups has less than 100 users and posts are only made
every week or two. I've had them both on RSS feeds for a couple of
months and there is very little activity. The ST Micro forum is much
better but doesn't support RSS feeds which makes it difficult to keep
up to date with what's going on.

> What optimization level are you using?

Level 1

> Any chance this call/macro is doing something to the RCC_FLAG?
>
> assert_param(IS_RCC_FLAG(RCC_FLAG));

Nope, this one is fine. It's removed when compiled in release mode.

> Anyway, the best way to diagnose or confirm this problem would be
> to compare assembly output from the compiler.
>
> Can you post the assembly for both cases?

The non-optimised version is considerably larger... so I've just
dumped the comparisons between vol and non-vol.
Optimization - Level 1 - Non Volatile
====================================================================FlagStatus RCC_GetFlagStatus(u8 RCC_FLAG)
{
B2C0 uxtb r0, r0
/* Check the parameters */
assert_param(IS_RCC_FLAG(RCC_FLAG));
/* Get the RCC register index */
tmp = RCC_FLAG >> 5;
EA4F1350 mov.w r3, r0, lsr #0x05
if (tmp == 1) /* The flag to check is in CR register */
2B01 cmp r3, #0x01
{
statusreg = RCC->CR;
BF04 itt eq
4B08 ldreq r3, [pc, #0x020]
681B ldreq r3, [r3, #0x00]
if (tmp == 1) /* The flag to check is in CR register */
D005 beq 0x08007390
{
statusreg = RCC->CR;
}
else if (tmp == 2) /* The flag to check is in BDCR register */
2B02 cmp r3, #0x02
{
statusreg = RCC->BDCR;
BF07 ittee eq
4B05 ldreq r3, [pc, #0x014]
6A1B ldreq r3, [r3, #0x20]
}
else /* The flag to check is in CSR register */
{
statusreg = RCC->CSR;
4B04 ldrne r3, [pc, #0x010]
6A5B ldrne r3, [r3, #0x24]
F000001F and r0, r0, #0x0000001F
FA23F000 lsr.w r0, r3, r0
F0000001 and r0, r0, #0x00000001
}
/* Return the flag status */
return bitstatus;
}
4770 bx lr
0000 lsls r0, r0, #0x00
1000 asrs r0, r0, #0x00
4002 ands r2, r0
{
B500 push {lr}

Optimization - Level 1 - Volatile
====================================================================FlagStatus RCC_GetFlagStatus(u8 RCC_FLAG)
{
B081 sub sp, #0x004
B2C0 uxtb r0, r0
vu32 tmp = 0;
F04F0300 mov.w r3, #0x00000000
9300 str r3, [sp, #0x000]
/* Check the parameters */
assert_param(IS_RCC_FLAG(RCC_FLAG));
/* Get the RCC register index */
tmp = RCC_FLAG >> 5;
EA4F1350 mov.w r3, r0, lsr #0x05
9300 str r3, [sp, #0x000]
if (tmp == 1) /* The flag to check is in CR register */
9B00 ldr r3, [sp, #0x000]
2B01 cmp r3, #0x01
{
statusreg = RCC->CR;
BF04 itt eq
4B0A ldreq r3, [pc, #0x028]
681A ldreq r2, [r3, #0x00]
if (tmp == 1) /* The flag to check is in CR register */
D006 beq 0x0800739E
{
statusreg = RCC->CR;
}
else if (tmp == 2) /* The flag to check is in BDCR register */
9B00 ldr r3, [sp, #0x000]
2B02 cmp r3, #0x02
{
statusreg = RCC->BDCR;
BF07 ittee eq
4B07 ldreq r3, [pc, #0x01C]
6A1A ldreq r2, [r3, #0x20]
}
else /* The flag to check is in CSR register */
{
statusreg = RCC->CSR;
4B06 ldrne r3, [pc, #0x018]
6A5A ldrne r2, [r3, #0x24]
}
/* Get the flag position */
tmp = RCC_FLAG & FLAG_Mask;
F000031F and r3, r0, #0x0000001F
9300 str r3, [sp, #0x000]
if ((statusreg & ((u32)1 << tmp)) != (u32)RESET)
9800 ldr r0, [sp, #0x000]
FA22F000 lsr.w r0, r2, r0
F0000001 and r0, r0, #0x00000001
}
/* Return the flag status */
return bitstatus;
}
B001 add sp, #0x004
4770 bx lr
0000 lsls r0, r0, #0x00
1000 asrs r0, r0, #0x00
4002 ands r2, r0
{
B530 push {r4-r5, lr}

--- In l..., "Darcy" wrote:
>
> Hi all,
>
> Okay, I know this is the LPC2000 forums but there still aren't any good
> Cortex forums... plus the Rowley guy regularly trawl this forum :)
>
> This issue may affect anyone using GCC and ARM7s as well so I don't feel
> too guilty for posting here.
> The scenario is that during power-on the Cortex M3 waits for its
> oscillator to stabilise. Once done, off we go. If it fails then it
> starts running on the internal oscillator. There is a particular
> function in one of the library files that suffers severe problems when
> optimised - resulting in the processor running on the internal
> oscillator instead.
>
> The function is found in stm32f10x_rcc.c and looks something like
> this...
>
> FlagStatus RCC_GetFlagStatus(u8 RCC_FLAG)
> {
> u32 tmp = 0; //Set to volatile to fix this problem
> u32 statusreg = 0;
> FlagStatus bitstatus = RESET;
>
> /* Check the parameters */
> assert_param(IS_RCC_FLAG(RCC_FLAG));
>
> /* Get the RCC register index */
> tmp = RCC_FLAG >> 5;
>
> if (tmp == 1) /* The flag to check is in CR register
> */
> {
> statusreg = RCC->CR;
> }
> else if (tmp == 2) /* The flag to check is in BDCR
register
> */
> {
> statusreg = RCC->BDCR;
> }
> else /* The flag to check is in CSR register
> */
> {
> statusreg = RCC->CSR;
> }
>
> /* Get the flag position */
> tmp = RCC_FLAG & FLAG_Mask;
>
> if ((statusreg & ((u32)1 << tmp)) != (u32)RESET)
> {
> bitstatus = SET;
> }
> else
> {
> bitstatus = RESET;
> }
>
> /* Return the flag status */
> return bitstatus;
> }
>
> The problem occurs specifically with the "tmp" variable. If "tmp" is
> not declared as volatile then the final if condition will ALWAYS set
> bitstatus = RESET. RCC_FLAG is always passed in as 0x31
>
> This seems to us like a compiler error. I would never have thought
> setting "tmp" as volatile would fix this as I certainly didn't think it
> would cause a problem in the first place.
>
> I'm wondering if this has something to do with the Thumb2 instruction
> set?
>
> The version of GCC currently provided with Rowley CrossStudio for ARM is
> 4.1.1
>
> This problem was also reported on the ST Micro Cortex M3 forums but no
> one responded Link Here Hopefully someone might have an idea as to what's going on here. I
> don't really want to have to modify the libraries just to fix this one
> small problem as I can't help but be worried as to what else is going on
> in all the other library files!
>
> Cheers
> Darcy
>
>
>

I don't want to enter in the details of your specific problem since I
don't know the Cortex M3 architecture and I don't understand what you
really want to do with your procedure.
But I don't think your problem is related to Cortex, the thumb 2
instruction set or a compiler error.
Any new generation compiler that is able to perform some kind of
optimization will not read a value from a memory address if it has
already the same value in a processor register from a precedent operation.
For example:

y = a + 1; // 'a' COULD be read in a processor register.
z = a + 3; // 'a' is not read from memory (probably).

If the 'a' variable changes between the two statements, you can get
unexpected results.
Usually a variable changes because:
1) there was an interrupt between the two statements and the ISR
changed the variable.
2) the variable represent a register in an hardware device, that can
change asynchronously to your program flow.

In both case the solution are:
a) turn off any compiler optimization.
or
b) carefully determine what variables need to be declared volatile.

If you don't want to touch the original libraries, you can rewrite and
correct only the buggy function and use your version.

I've been trying to figure out the source of this problem also. As an
experiment, I simplified the routine in question, as much as I could
(by removing its generality), down to this :
static FlagStatus MyGetStatus( void )
{
return ( RCC->CR & 0x00020000 ) ? SET : RESET;
}
This version exhibits the same problem - HSE lock is detected in DEBUG
mode, and in RELEASE mode with no optimization. HSE lock is not
detected in RELEASE mode with any optimization level except "none".

I am assuming at this point that it's a difference in the way these
registers are defined in DEBUG vs RELEASE mode, but I don't fully
understand the differences yet.

Seems to me that the volatile-ness of the RCC register is not being
honored, in RELEASE mode, but I don't know why.

Also - since Rowley has been mentioned in this thread, I think it's
only fair to say that this is definitely a FWLib problem, and NOT a
Rowley problem. If I replace my "simple" function with this one (based
on Rowley's method of register definition), it works fine.
#define RCC_CR (*(volatile unsigned long *)0x40021000)

static FlagStatus MyGetStatus( void )
{
return ( RCC_CR & 0x00020000 ) ? SET : RESET;
//return ( RCC->CR & 0x00020000 ) ? SET : RESET;
}

Hi Brian,

Smells a bit like the volatile on the field of the struct is getting
lost. Could you try making the struct volatile

typedef volatile struct
{
vu32 CR;
....
vu32 CSR;
} RCC_TypeDef;

Regards
Michael

>
> Also - since Rowley has been mentioned in this thread, I think it's
> only fair to say that this is definitely a FWLib problem, and NOT a
> Rowley problem. If I replace my "simple" function with this one (based
> on Rowley's method of register definition), it works fine.
> #define RCC_CR (*(volatile unsigned long *)0x40021000)
>
> static FlagStatus MyGetStatus( void )
> {
> return ( RCC_CR & 0x00020000 ) ? SET : RESET;
> //return ( RCC->CR & 0x00020000 ) ? SET : RESET;
> }
>

--- In l..., "nutleycottage" wrote:
>
> Hi Brian,
>
> Smells a bit like the volatile on the field of the struct is getting
> lost. Could you try making the struct volatile
>
> typedef volatile struct
> {
> vu32 CR;
> ....
> vu32 CSR;
> } RCC_TypeDef;
>
> Regards
> Michael
>

This change works, too.

Certainly looks like the compiler is ignoring the volatile keyword on
struct *members*, which seems to me to be a compiler error, unless I'm
missing something obvious.

--- In l..., "brian_myers888"
wrote:
>
> --- In l..., "nutleycottage" wrote:
> >
> > Hi Brian,
> >
> > Smells a bit like the volatile on the field of the struct is getting
> > lost. Could you try making the struct volatile
> >
> > typedef volatile struct
> > {
> > vu32 CR;
> > ....
> > vu32 CSR;
> > } RCC_TypeDef;
> >
> > Regards
> > Michael
> > This change works, too.
>
> Certainly looks like the compiler is ignoring the volatile keyword on
> struct *members*, which seems to me to be a compiler error, unless I'm
> missing something obvious.
>

I can add to this suspicion. I had problems with my CAN interface
when accessing volatile structs and their members that mapped to the
CAN register map. I cannot remember the symptoms exactly, but I ended
up adding 'volatile' to all access methods and struct pointers.

Rgds,
Martin