EmbeddedRelated.com
Forums

Beware of GNU-ARM compiler for Cortex-M0/M0+/M1

Started by StateMachineCOM October 10, 2017
On 05/12/17 21:58, mrfirmware wrote:
> On Thursday, October 12, 2017 at 10:04:55 AM UTC-4, David Brown wrote: >> On 11/10/17 18:51, David Brown wrote: >>> On 11/10/17 18:03, StateMachineCOM wrote: >>>> Thank you everyone for attention. There is really no need to be >>>> hostile. I'm NOT trying to sell you anything. I merely didn't have the >>>> time to distill the problem to be completely "context free". >>>> >>>> But I was was able to distill the problem to a relatively small >>>> snippet of code without any external dependencies or macros. I filed >>>> this information as an official bug report at GCC-ARM-Embedded, please >>>> see: >>>> >>>> https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849 >>>> >>>> As I experimented with this code, the excessive type casting in the >>>> condition for the if statement seems to be implicated (the bug goes >>>> away if I remove some of this type casting). The type casting has been >>>> added in the first place to satisfy static analysis with PC-Lint for >>>> MISRA-C compliance. >>>> >>> >>> No hostility was intended - I just want to make sure that this issue is >>> considered properly, and followed up properly.  I have seen too many >>> people drop into a newsgroup like this and make claims about compiler >>> bugs, then disappear (perhaps in embarrassment) when it is their own >>> code that is found faulty.  I want to push you to follow the thread here >>> and keep things updated >>> >>> Thank you for posting the test code (in the launchpad bug report).  I >>> can't see anything wrong with the code you wrote so far.  I am a little >>> short on time just now (it's dinner time here :-) ) but I will do some >>> experiments with the code as soon as I get the chance, and get back to you. >> >> I have just added this comment to the gcc-arm-embedded bug report linked >> above: >> >> >> I agree that it is surprising that "asm volatile" statements can be >> re-arranged with respect to each other, and with respect to other >> volatile accesses. This seems to be a particular problem with asm >> statements that return outputs - "volatile" is used primarily to tell >> the compiler that you can get different outputs at different times, even >> if the inputs (if any) are the same. For asm statements with no >> outputs, the compiler appears to assume they have a special function and >> should not be moved. >> >> As far as I know, there is no way in C or as a gcc extension to specify >> ordering of statements or executable code - you can only specify >> ordering of memory (via volatile accesses). Even the traditional method >> of declaring a function (like "foo" in the sample) externally is in >> danger - with link-time optimisation, the compiler knows everything and >> can re-arrange bits of "foo" with respect to the asm statements or >> volatile accesses. > > To prevent compiler reordering I use this inline asm macro. > > __asm__ __volatile("":::"memory"); > > I'm pretty sure that tells gcc that it should not reorder above and > below that statement. So you might sprinkle those about?
No, that only limits the compiler's ability to re-order data reads and writes to memory - it does not affect statements or executable code except indirectly (if code is needed to calculate a value that is written before the memory clobber, then of course that code needs to run before the clobber statement). See the URL I gave a couple of lines down for an example where even asm volatile("" ::: "memory") does not help.
> >> A related problem is well documented for the AVR gcc port: >> >> http://www.nongnu.org/avr-libc/user-manual/optimization.html >> >> There is, however, a solution to all this. (I have told the avr-libc >> folks about it a number of times, but they have not added my solution to >> their webpage. I have given up trying to persuade them.) >> >> >> I have three macros defined that I use in circumstances like these: >> >> #define forceDependency(val) \ >> asm volatile("" :: "" (val) : ) >> >> #define forgetCompilerKnowledge(v) \ >> asm ("" : "+g" (v)) >> >> #define forgetCompilerBlock(start, size) \ >> do { typedef struct { char x[size]; } XS; XS *p = (XS *) start; \ >> asm ("" : "+m" (*p)); \ >> } while (0); >> >> >> >> The first one tells the compiler "I am using "val" here, so you have to >> evaluate it before this point". The second one tells the compiler "I am >> using "val" here, and I might change it, so you have to evaluate it >> before this point, but forget anything you know about it". The third >> one is just another version that can handle data of any size. >> >> >> Putting "forceDependency(status)" after the "mrs" instruction, but >> before the "foo()" call, ensures that the compiler has evaluated >> "status" before calling foo. It makes most sense to put it before the >> "cpsid i" instruction, but that does not appear to be critical. The >> neatest arrangement is to combine it with the cpsid: >> >> uint32_t status; >> asm volatile ("mrs %0, PRIMASK" : "=r" (status) :: ); >> asm volatile ("cpsid i" :: "" (status) :); >> >> foo(); >> >> asm volatile ("msr PRIMASK, %0" :: "r" (status) : ); >
On Tuesday, December 5, 2017 at 6:59:45 PM UTC-5, David Brown wrote:
> On 05/12/17 21:58, mrfirmware wrote: > > On Thursday, October 12, 2017 at 10:04:55 AM UTC-4, David Brown wrote: > >> On 11/10/17 18:51, David Brown wrote: > >>> On 11/10/17 18:03, StateMachineCOM wrote: > >>>> Thank you everyone for attention. There is really no need to be > >>>> hostile. I'm NOT trying to sell you anything. I merely didn't have the > >>>> time to distill the problem to be completely "context free". > >>>> > >>>> But I was was able to distill the problem to a relatively small > >>>> snippet of code without any external dependencies or macros. I filed > >>>> this information as an official bug report at GCC-ARM-Embedded, please > >>>> see: > >>>> > >>>> https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849 > >>>> > >>>> As I experimented with this code, the excessive type casting in the > >>>> condition for the if statement seems to be implicated (the bug goes > >>>> away if I remove some of this type casting). The type casting has been > >>>> added in the first place to satisfy static analysis with PC-Lint for > >>>> MISRA-C compliance. > >>>> > >>> > >>> No hostility was intended - I just want to make sure that this issue is > >>> considered properly, and followed up properly.  I have seen too many > >>> people drop into a newsgroup like this and make claims about compiler > >>> bugs, then disappear (perhaps in embarrassment) when it is their own > >>> code that is found faulty.  I want to push you to follow the thread here > >>> and keep things updated > >>> > >>> Thank you for posting the test code (in the launchpad bug report).  I > >>> can't see anything wrong with the code you wrote so far.  I am a little > >>> short on time just now (it's dinner time here :-) ) but I will do some > >>> experiments with the code as soon as I get the chance, and get back to you. > >> > >> I have just added this comment to the gcc-arm-embedded bug report linked > >> above: > >> > >> > >> I agree that it is surprising that "asm volatile" statements can be > >> re-arranged with respect to each other, and with respect to other > >> volatile accesses. This seems to be a particular problem with asm > >> statements that return outputs - "volatile" is used primarily to tell > >> the compiler that you can get different outputs at different times, even > >> if the inputs (if any) are the same. For asm statements with no > >> outputs, the compiler appears to assume they have a special function and > >> should not be moved. > >> > >> As far as I know, there is no way in C or as a gcc extension to specify > >> ordering of statements or executable code - you can only specify > >> ordering of memory (via volatile accesses). Even the traditional method > >> of declaring a function (like "foo" in the sample) externally is in > >> danger - with link-time optimisation, the compiler knows everything and > >> can re-arrange bits of "foo" with respect to the asm statements or > >> volatile accesses. > > > > To prevent compiler reordering I use this inline asm macro. > > > > __asm__ __volatile("":::"memory"); > > > > I'm pretty sure that tells gcc that it should not reorder above and > > below that statement. So you might sprinkle those about? > > No, that only limits the compiler's ability to re-order data reads and > writes to memory - it does not affect statements or executable code > except indirectly (if code is needed to calculate a value that is > written before the memory clobber, then of course that code needs to run > before the clobber statement). See the URL I gave a couple of lines > down for an example where even asm volatile("" ::: "memory") does not help. > > > > >> A related problem is well documented for the AVR gcc port: > >> > >> http://www.nongnu.org/avr-libc/user-manual/optimization.html > >> > >> There is, however, a solution to all this. (I have told the avr-libc > >> folks about it a number of times, but they have not added my solution to > >> their webpage. I have given up trying to persuade them.) > >> > >> > >> I have three macros defined that I use in circumstances like these: > >> > >> #define forceDependency(val) \ > >> asm volatile("" :: "" (val) : ) > >> > >> #define forgetCompilerKnowledge(v) \ > >> asm ("" : "+g" (v)) > >> > >> #define forgetCompilerBlock(start, size) \ > >> do { typedef struct { char x[size]; } XS; XS *p = (XS *) start; \ > >> asm ("" : "+m" (*p)); \ > >> } while (0); > >> > >> > >> > >> The first one tells the compiler "I am using "val" here, so you have to > >> evaluate it before this point". The second one tells the compiler "I am > >> using "val" here, and I might change it, so you have to evaluate it > >> before this point, but forget anything you know about it". The third > >> one is just another version that can handle data of any size. > >> > >> > >> Putting "forceDependency(status)" after the "mrs" instruction, but > >> before the "foo()" call, ensures that the compiler has evaluated > >> "status" before calling foo. It makes most sense to put it before the > >> "cpsid i" instruction, but that does not appear to be critical. The > >> neatest arrangement is to combine it with the cpsid: > >> > >> uint32_t status; > >> asm volatile ("mrs %0, PRIMASK" : "=r" (status) :: ); > >> asm volatile ("cpsid i" :: "" (status) :); > >> > >> foo(); > >> > >> asm volatile ("msr PRIMASK, %0" :: "r" (status) : ); > >
Well that makes sense. I'm pretty sure I only use this clobber macro for mem rd/wr barriers so I think I'm okay but still, good to know. Thanks.