Beware of GNU-ARM compiler for Cortex-M0/M0+/M1

The popular GNU-ARM toolset has had long-known issues for the Cortex-M0/M0+/M1 (ARMv6-M architecture). Specifically, people have reported very inefficient code generated, see "Cortex M0/M0+/M1/M23 BAD Optimisation in GCC" https://embdev.net/topic/426508 .

But while so far people reported only inefficient code, I would like to make people aware of *incorrect* code generated by GNU-ARM for Cortex-M0/M0+.

The issue was detected with interrupt disabling and has been documented in a bug report for the QP framework, see https://sourceforge.net/p/qpc/bugs/184/ . Experiments with the latest available GNU-ARM (GNU Tools for ARM Embedded Processors 6-2017-q2-update, 6.3.1 20170620 release) clearly show incorrect code generated at optimization level -O, while the same code compiled at -O2 seemed to be correct.

Please be careful with GNU-ARM for ARMv6-M architecture and preferably avoid using it for these CPUs as long as the issue remains unresolved.

Miro Samek
state-machine.com

Reply by Dave Nadler ● October 10, 2017

Thanks Miro for bringing this to our attention...

Reply by Tauno Voipio ● October 10, 2017


Do you have an example source snippet?

QP feels pretty heavy for Cortex-M0.

-- 

-TV

Reply by David Brown ● October 10, 2017


It is impossible for anyone to determine if this is a bug in the 
compiler or a bug in the QS macros without giving us the source of the 
test.  Can you give us the source of these macros (or if they are 
proprietary, a roughly equivalent source that shows the same problems)? 
I'd like to see it, and try it on a simple case such as the example in 
the linked page.

void crit_section_test(void) {
     uint32_t i;
     for(i = 0; i < 10; i++) {
         QS_BEGIN(123, 0);
             QS_U32(8, 0);
         QS_END();
     }
}

My guess here is that there is a misunderstanding or error in the 
embedded assembly in these macros.  gcc inline assembly can be a bit 
fiddly to get exactly right.

Reply by Tauno Voipio ● October 10, 2017


I just wonder if QP attempts to use the exclusive access instruction
pairs (LDREX / STREX), which do not exist in M0 and M1.

-- 

-TV

Reply by David Brown ● October 11, 2017


The page the OP linked has a screen dump of the generated assembly, 
and there is no LDREX or STREX in it.

My guesses for the problem are missing "volatile" in the asm statements, 
multiple independent asm statements where there should be a single one, 
or incorrect dependency information in the asm statements or other code.

gcc does a lot of optimisation and re-arrangement of code, including 
with inline assembly.  It is easy to get it wrong when you depend on the 
order of the code in a way that the compiler does not know about.
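To make those three suspects concrete, here is a minimal sketch of an interrupt-disabling critical section written with GCC extended asm. It is not the QP code: the names crit_enter, crit_exit, fake_primask and crit_section_demo are hypothetical, and the ARM branch assumes an ARMv6-M target (detected through __ARM_ARCH_6M__). On any other target the sketch falls back to a plain variable plus a compiler barrier, so it can be built and exercised on a host. Each asm statement is volatile, the PRIMASK read and the CPSID are kept in one statement, and the "memory" clobber tells the compiler not to move memory accesses across it.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__ARM_ARCH_6M__)
/* Host stand-in for the PRIMASK register (assumption: host build only). */
static uint32_t fake_primask;
#endif

/* Save the current interrupt mask and disable interrupts. */
static inline uint32_t crit_enter(void) {
    uint32_t primask;
#if defined(__ARM_ARCH_6M__)
    /* One volatile asm statement: read PRIMASK, then mask interrupts.
     * The "memory" clobber stops the compiler from moving memory
     * accesses across this point. */
    __asm__ volatile ("mrs %0, PRIMASK\n\t"
                      "cpsid i"
                      : "=r" (primask)
                      :
                      : "memory");
#else
    primask = fake_primask;
    fake_primask = 1U;
    __asm__ volatile ("" ::: "memory");  /* compiler barrier only */
#endif
    return primask;
}

/* Restore the interrupt mask saved by crit_enter(). */
static inline void crit_exit(uint32_t primask) {
#if defined(__ARM_ARCH_6M__)
    __asm__ volatile ("msr PRIMASK, %0" : : "r" (primask) : "memory");
#else
    assert(fake_primask == 1U);          /* must be inside a critical section */
    __asm__ volatile ("" ::: "memory");  /* compiler barrier only */
    fake_primask = primask;
#endif
}

static uint32_t shared_counter;

/* Same shape as the loop in the test case: ten short critical sections. */
uint32_t crit_section_demo(void) {
    uint32_t i;
    for (i = 0U; i < 10U; i++) {
        uint32_t saved = crit_enter();
        shared_counter += 8U;            /* work that must not be interrupted */
        crit_exit(saved);
    }
    return shared_counter;
}
```

Saving and restoring the mask, rather than unconditionally re-enabling interrupts, lets such sections nest. Whether the QP macros follow this pattern is not something this sketch claims.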

Reply by pozz ● October 11, 2017

On 10/10/2017 20:55, David Brown wrote:
> It is impossible for anyone to determine if this is a bug in the 
> compiler or a bug in the QS macros without giving us the source of the 
> test.  Can you give us the source of these macros (or if they are 
> proprietary, a roughly equivalent source that shows the same problems)? 

I think QP/C is an open-source project (even if it isn't free to use for 
commercial purposes).

The source code is here:
https://github.com/QuantumLeaps/qpc


> I'd like to see it, and try it on a simple case such as the example in 
> the linked page.
> 
> void crit_section_test(void) {
>      uint32_t i;
>      for(i = 0; i < 10; i++) {
>          QS_BEGIN(123, 0);
>              QS_U32(8, 0);
>          QS_END();
>      }
> }
> 
> My guess here is that there is a misunderstanding or error in the 
> embedded assembly in these macros.  gcc inline assembly can be a bit 
> fiddly to get exactly right.
> 

QS_BEGIN and QS_END are defined in include/qs.h, but depend on many 
other macros.

Reply by David Brown ● October 11, 2017

On 11/10/17 16:01, pozz wrote:
> QS_BEGIN and QS_END are defined in include/qs.h, but depend on many 
> other macros.

Yes, I saw the source was there - but I have no interest in the project, 
and no interest in digging through all the source of that project to try 
to find the problem.  The OP is one of the people behind that project, 
as far as I can see - he should be able to provide a small 
self-contained equivalent definition for the macros so that we can get 
to the bottom of his problem.

My take on this at the moment is that it is most likely to be a flaw in 
the QP code, not the compiler.  I am happy to help, whether it turns out 
to be a compiler problem or a QP problem.

But the OP has to do some work here, not just give a hit-and-run FUD 
about the compiler that is far and away the dominant tool for these 
microcontrollers.  "Avoid using gcc for the M0/M0+" is advice to avoid 
those microcontrollers entirely.

Reply by StateMachineCOM ● October 11, 2017

Thank you everyone for your attention. There is really no need to be hostile. I'm NOT trying to sell you anything. I merely didn't have the time to distill the problem to be completely "context free".

But I was able to distill the problem to a relatively small snippet of code without any external dependencies or macros. I filed this information as an official bug report at GCC-ARM-Embedded, please see:

https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849

As I experimented with this code, the excessive type casting in the condition for the if statement seems to be implicated (the bug goes away if I remove some of this type casting). The type casting has been added in the first place to satisfy static analysis with PC-Lint for MISRA-C compliance.
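For readers who have not met this style, the sketch below shows the kind of cast-heavy condition meant here next to a plain version. It is NOT the code from the bug report: filter_bits, filter_on_casts, filter_on_plain, filter_mismatches and filter_self_check are hypothetical names made up for illustration. The two forms are equivalent under the C rules, so a host build can confirm that removing casts does not change the meaning; if the target build then behaves differently, the difference lies elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bitmask filter: one bit per record ID (0..127). */
static uint8_t filter_bits[16];

/* Cast-heavy form, in the style often written to satisfy a MISRA-C checker. */
static bool filter_on_casts(uint8_t rec) {
    return (uint_fast8_t)((uint_fast8_t)filter_bits[(uint8_t)(rec >> 3)]
               & (uint_fast8_t)((uint_fast8_t)1U
                                << (uint_fast8_t)(rec & (uint8_t)7U)))
           != (uint_fast8_t)0U;
}

/* Plain form that relies on the usual integer promotions. */
static bool filter_on_plain(uint8_t rec) {
    return (filter_bits[rec >> 3] & (1U << (rec & 7U))) != 0U;
}

/* Count the record IDs on which the two forms disagree. */
unsigned filter_mismatches(void) {
    unsigned mismatches = 0U;
    unsigned rec;

    filter_bits[0]  = 0x81U;   /* records 0 and 7 enabled */
    filter_bits[15] = 0x10U;   /* record 124 enabled */

    for (rec = 0U; rec < 128U; rec++) {
        if (filter_on_casts((uint8_t)rec) != filter_on_plain((uint8_t)rec)) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Host-side self-check: the two forms must agree for every record ID. */
void filter_self_check(void) {
    assert(filter_mismatches() == 0U);
}
```

Comparing the generated assembly of the two functions at -O and -O2 for the target would be the natural next step.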

--MMS

Reply by David Brown ● October 11, 2017


No hostility was intended - I just want to make sure that this issue is 
considered properly, and followed up properly.  I have seen too many 
people drop into a newsgroup like this and make claims about compiler 
bugs, then disappear (perhaps in embarrassment) when it is their own 
code that is found faulty.  I want to push you to follow the thread here 
and keep things updated.

Thank you for posting the test code (in the launchpad bug report).  I 
can't see anything wrong with the code you wrote so far.  I am a little 
short on time just now (it's dinner time here :-) ) but I will do some 
experiments with the code as soon as I get the chance, and get back to you.
