Has anyone seen this GCC compiler behavior before?

Started by mjbcswitzerland 5 years ago, 9 replies, latest reply 5 years ago, 188 views

Hi All

I had an interesting experience when using some ARM Cortex initialisation code that I have been using for over 10 years. I used it in an i.MX RT development and with IAR (full optimisation for size) it worked as normal, but when I built with GCC
gcc version 8.3.1 20190703 (release) [gcc-8-branch revision 273027] (GNU Tools for Arm Embedded Processors 8-2019-q3-update)
it failed in a rather spectacular way - which I have never experienced before.

First of all, this is the original code.


    VECTOR_TABLE_OFFSET_REG = (unsigned long)RAM_START_ADDRESS;  // position the vector table at the bottom of instruction RAM
    ptrVect = (VECTOR_TABLE *)VECTOR_TABLE_OFFSET_REG;
    ptrVect->ptrNMI           = irq_NMI;
    ptrVect->ptrHardFault     = irq_hard_fault;
    ptrVect->ptrMemManagement = irq_memory_man;
    ptrVect->ptrBusFault      = irq_bus_fault;
    ptrVect->ptrUsageFault    = irq_usage_fault;
    ptrVect->ptrDebugMonitor  = irq_debug_monitor;
    ptrVect->ptrSysTick       = irq_default;
    fnSubroutineCall(); // any subroutine as reference


The code sets the vector table offset register and fills out a few default interrupt handlers, whereby I follow the code with some random call for illustration purposes. The interrupt vectors are located at the start of SRAM, which tends to be at 0x20000000, or in that area in a lot of cases for Cortex processors.

The failure I had when building for an i.MX RT processor with GCC was that all of this code was completely optimised away, including the subroutine call. However it DIDN'T do this if I located the vectors in data RAM (tightly coupled, optimised for data access) but it DID when I located them in instruction RAM (tightly coupled, optimised for instructions).

A workaround to stop it doing this was to ensure that the vector table offset register is declared as a volatile register (in which case both locations work normally), rather than without a volatile attribute (in which case the instruction RAM location causes the complete following code to be removed. I found the following subroutine thrown into the "discarded input sections", which I believe means that the linker considered it "dead wood" and threw it away).

I found this such an interesting case that I wondered whether others have seen the same behavior, and also how many people can quickly identify the reasons for it.
One hint is that the i.MX RT's data RAM (DTCM) is located starting at 0x20000000 in the memory map and its instruction RAM (ITCM) is located starting at 0x00000000.

Any takers?

Regards

Mark
[uTasker project developer for Kinetis and i.MX RT]

Reply by jms_nh, January 19, 2020

99% chance I can explain, 1% it's something completely different.

RAM_START_ADDRESS is zero, correct?

You're running into undefined behavior, because the value 0 is a null pointer, and reading or writing to the content of a null pointer is undefined behavior, which the compiler is allowed to assume will never occur.

See this snippet on Compiler Explorer: try1() never stores the values 13 or 42 and never calls whatever() (it never even returns from the function: it writes the value 0 to address 0 and, if that succeeds, just keeps on executing), but the other two functions do.

https://godbolt.org/z/mcQ9dq

    #include <stdint.h>
    struct banzai
    {
        uint16_t foo;
        uint16_t bar;
    };
    #define IVTADDR     0
    #define NOTIVTADDR  0x1000
    void whatever(void);
    inline static void banzai_init(volatile struct banzai *pb)
    {
        pb->foo = 13;
        pb->bar = 42;
        whatever();
    }
    void try1(void)
    {
        uint32_t ivtaddr = IVTADDR;
        volatile struct banzai *pb = (volatile struct banzai *)ivtaddr;
        banzai_init(pb);
    }
    void try2(void)
    {
        uint32_t ivtaddr = NOTIVTADDR;
        volatile struct banzai *pb = (volatile struct banzai *)ivtaddr;
        banzai_init(pb);
    }
    // let the linker locate this at zero
    volatile struct banzai banzai0;
    void try3(void)
    {
        banzai_init(&banzai0);
    }

Reply by Kocsonya, January 19, 2020

Yes, you are completely correct.

A lot of people do not seem to know how positively evil the C standard is regarding undefined behaviour.

Undefined behaviour basically means that the compiler (and the code at run-time) can do absolutely anything *and* the compiler does not need to issue a warning that it removed code, changed code, inserted a format-all-disks call into your code, or anything else.

Plus, undefined behaviour can be triggered by seemingly innocent things, like shifting a negative integer to the left, dereferencing a NULL pointer, shifting an integer by at least as many bits as it has, or causing a signed integer arithmetic overflow.
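
A minimal sketch of how the shift and overflow cases listed above can be sidestepped in portable C. The helper names checked_add and safe_shl are invented for illustration and do not come from the thread; each one tests the precondition first, so the undefined case is never reached.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: add two ints, reporting overflow instead of causing it. */
static bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* a + b would overflow, which is undefined for signed int */
    }
    *result = a + b;
    return true;
}

/* Hypothetical helper: left shift of an unsigned value with a range-checked count. */
static uint32_t safe_shl(uint32_t value, unsigned count)
{
    if (count >= 32u) {
        return 0u;      /* a shift by the full width or more is undefined, so pick a result */
    }
    return (uint32_t)(value << count);
}
```

Working on unsigned types for shifts also avoids the negative-operand case altogether.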

The list of things that trigger undefined behaviour is some 14 pages in the C standard. Many items on that list are things that embedded engineers would not bat an eyelid about. Some items are pretty much against common sense except if you consider that the C standard wants to also serve machines that have 13 bit words that are stored in a randomised bit pattern form, using a signum bit and an unsigned value in BCD for numbers and 7 bit characters encoded using the Commodore-64 character set, and the NULL pointer is indicated by the bit pattern 1101011101011.

GCC tries to optimise the code to no end. So it assumes that your code never encounters an undefined behaviour trigger. It is allowed to do so: if your code doesn't, then it was right; if it does, then it had the right to do whatever it wanted. And when it sees a program fragment that is obviously a U.B. trigger, it simply removes it, for it can rightfully do so.
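
To make that reasoning concrete, here is a hedged sketch with invented function names (nothing here is taken from the original project). In the first function the pointer is dereferenced before it is tested, so an optimiser is entitled to conclude that it cannot be null and may drop the test. In the second the test comes first, so it has to stay.

```c
#include <stddef.h>

/* Risky ordering: *p is read first, so the compiler may assume p != NULL
   and is then allowed to remove the check that follows. */
static int value_or_default_risky(const int *p, int fallback)
{
    int value = *p;          /* undefined behaviour if p is NULL */
    if (p == NULL) {         /* may be optimised away */
        return fallback;
    }
    return value;
}

/* Safe ordering: the check precedes the dereference, so it must be kept. */
static int value_or_default(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;
    }
    return *p;
}
```

Whether a given compiler actually removes the check in the first function depends on its version and optimisation level; the point is only that it is permitted to.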

Reply by mjbcswitzerland, January 19, 2020

Yes, that is also my conclusion.

If the value 0 (which is legal for the vector table offset and for the physical RAM available) is known to the compiler, it discards the code and (depending on the optimisation level) all following code, without any warning.

If I set the value to 0 but declare the register that it is written to as volatile, the compiler can no longer assume that it is a null pointer, so it includes the code and all works.

So I learned that if a NULL pointer is actually used (and the GCC compiler knows about it at build time) it silently discards the code. IAR doesn't - it still builds it. This is presumably due to the specification allowing the compiler to do what it wants in this case.

Since instruction code is located at 0x00000000 in the i.MX RT and the vector table is also recommended to be located there for optimal performance, one actually has to write code whose behavior is "undefined" according to the C standard if accessed in this fashion....!!!! A NULL pointer is - in this code's sense - a "valid" memory pointer.

My solution is to hide the fact that a NULL pointer is being used (with the volatile register) so that the compiler is not allowed to "assume" anything and generates the code as it would for any 'legal' pointer. This is in the belief that no compiler would ever be able to handle it as an undefined case, thus resulting in the required operation [the code remains portable for initialising interrupt vectors of any Cortex processor]. On top of that, add a few lines of comment carefully explaining the reason for passing the NULL pointer through a volatile register.
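
A minimal sketch of that workaround, written so that it can run on a PC. The shortened VECTOR_TABLE, the sim_ names and install_vectors are assumptions for illustration only; on the target, VECTOR_TABLE_OFFSET_REG would be the real memory-mapped register and RAM_START_ADDRESS would be 0x00000000 for ITCM.

```c
#include <stdint.h>

typedef void (*irq_handler_t)(void);

/* Hypothetical, shortened vector table; the real VECTOR_TABLE has more entries. */
typedef struct {
    irq_handler_t ptrNMI;
    irq_handler_t ptrHardFault;
} VECTOR_TABLE;

/* Host-side stand-ins (assumptions) for the register and the RAM region. */
static VECTOR_TABLE       sim_vector_ram;
static volatile uintptr_t sim_vtor;          /* volatile is the essential part */
#define VECTOR_TABLE_OFFSET_REG  sim_vtor
#define RAM_START_ADDRESS        ((uintptr_t)&sim_vector_ram)

static void irq_NMI(void)        { }
static void irq_hard_fault(void) { }

static void install_vectors(void)
{
    VECTOR_TABLE *ptrVect;

    VECTOR_TABLE_OFFSET_REG = RAM_START_ADDRESS;
    /* The value is read back through a volatile object, so the compiler must
       perform the read and cannot treat the result as a known constant (such
       as 0) when it builds the pointer. */
    ptrVect = (VECTOR_TABLE *)VECTOR_TABLE_OFFSET_REG;
    ptrVect->ptrNMI       = irq_NMI;
    ptrVect->ptrHardFault = irq_hard_fault;
}
```

GCC also documents an option, -fno-delete-null-pointer-checks, for environments where address 0 is a valid location. Whether it is sufficient on its own for this case was not tested in this thread.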

Regards

Mark


Reply by rmilne, January 19, 2020

Reply by mjbcswitzerland, January 19, 2020

Thanks - I read the two parts.
There was a section about NULL pointer checks but nothing about code that uses a NULL pointer being compiled away (also, in that example the compiler couldn't predict that a NULL pointer would be used).

Reply by mr_bandit, January 19, 2020

You were probably using -O1 or -O2 flag.

C compilers can be notorious for optimizing away code they think is not needed. I think Jack Ganssle has a post on this.

Try -O0 with your failing code && see what happens

Reply by mjbcswitzerland, January 19, 2020

I checked

-O0 - fails (but doesn't kick the following call out)
-O1 - fails (but doesn't kick the following call out)
-O2 - fails (and kicks the following call out)
-Os - fails (and kicks the following call out)

The project uses -Os as default.

Reply by indigoredster, January 19, 2020

Compiler confusion between a null pointer and a value of 0x00000000.