This question came up in the thread titled "Re: Ada, was: Re: Fundamental C question about "if" statements". I felt it deserved its own thread, because it's far off topic to what's become a very interesting, but far off topic thread. I, for one, would really like to see one of our resident compiler experts answer the question at the bottom. On Wed, 23 Sep 2015 19:36:57 +0000, Simon Clubley wrote:> On 2015-09-23, Tim Wescott <seemywebsite@myfooter.really> wrote: >> On Wed, 23 Sep 2015 18:36:43 +0000, Simon Clubley wrote: >>> Good luck doing that in a reliable way on ARM with gcc when you are >>> accessing device registers. >>> >>> If you access a bitfield in the lower 8 bits of a register, the gcc >>> optimiser can turn that into 8-bit ldrb/strb opcodes instead of 32-bit >>> ldr/str opcodes and thereby causing the program to break if you need >>> to access your registers in units of greater than 8 bits. >> >> That's a very interesting comment, because that's basically How It Is >> Done here, and I've had several years worth of success on both ST and >> TI/ >> Luminary devices. I'm not sure if that's because of the way I define >> my bitfields (absolutely everything is defined as a struct of >> bitfields, then a union of that struct and a 32-bit integer), or if >> I've been lucky, >> or what. >> >> > Do your MCUs _require_ register access in units of 16/32 bits ? > > If so, you may be doing something which works for now, but may not work > in the future. > > It might be worth having a quick look with objdump just to be sure. > > I have personally had this happen to me and it broke my code due to the > register access size constraints no longer been true. > >> So it looks like: >> >> struct SPeriphRegisterBits { >> unsigned int bit0 : 1; unsigned int bits1_10 : 10; unsigned >> int bit11 : 1; unsigned int : 16; unsigned int >> bits28_31 : 4; >> }; >> >> union UPerihRegister { >> struct SPeriphRegisterBits bits; >> unsigned int all; >> } >> >> All of the various register definitions then get collected into a >> structure for the peripheral, which gets declared as "extern const", >> and defined in the linker command file. >> >> > This is a bit more involved than what I was doing. I wonder if this is > what is saving you for now.Yes, my MCU's require access in units of 16 or 32 bits, at least in places -- at least, if the ST data book is to be believed that's the case. I would REALLY like one of the compiler guys to weigh in on this, because I'm wondering if collecting all those unions into a struct and calling it volatile isn't forcing a 32-bit access. It would be nice to know if I'm just bamboozling the optimizer into doing what I want, or if I'm actually _telling_ it to do the right thing. So, the final definitions (that I left out) is struct SPeripheralRegs { UPeriphRegister1 REG1; UPeriphRegister2 REG2; -- etc -- }; extern volatile SPeripheralRegs THIS_PERIPHERAL; Then I access things by THIS_PERIPHERAL.REGISTER.bits.whatever = something; Is (oh compiler guys!!!) this directing the compiler to act as if I wrote: UPeriphRegister A; A.all = static_cast<volatile unsigned int>THIS_PERIPHERAL.REGISTER.all; A.bits.whatever = something; static_cast<volatile unsigned int>THIS_PERIPHERAL.REGISTER.all = A.all; I could see how it might do this by the intent of the language, but I could see that I may just be bamboozling the optimizer, rather than intelligently directing it. So -- I'm curious, in a more-than-idly sort of way. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com
Use of volatile in structure definitions to force word accesses
Started by ●September 23, 2015
Reply by ●September 23, 20152015-09-23
On 2015-09-23, Tim Wescott <seemywebsite@myfooter.really> wrote:> > I would REALLY like one of the compiler guys to weigh in on this, because > I'm wondering if collecting all those unions into a struct and calling it > volatile isn't forcing a 32-bit access. It would be nice to know if I'm > just bamboozling the optimizer into doing what I want, or if I'm actually > _telling_ it to do the right thing. >I've duplicated this using a variant of your structs and caused gcc to generate invalid code for accessing registers. I wasn't able to duplicate the range of cases I thought I could, but I found this in one of my header files I created some years ago: * * Using bitfield structs versus bitmasks to access register bitfields: * My original plan was to use bitfield structs to access the register * bitfields. However, gcc is generating code to access the bitfields * at a byte, and not longword, level. As the MMIO fields are longword * access only, this results in invalid code been generated. * * Marking the enclosing 32-bit struct, instead of the underlying bitfields, * as volatile results in code been correctly generated for most cases. * However, when the field to be modified is a byte in the least significant * 8 bits of the register, then gcc still generates a byte level store * even though it performed a longword level read. There may also be other * cases which I have not yet come across. * * As a result, I have switched to using bitmasks to access bitfields in a * register. * so it's clear that _where_ you place the volatile attribute is critical to how the generated code looks (and how broken it is). A longword is 32 bits BTW; that's my DEC background showing. :-) Here's a self contained example: ============================================================================ /* * Compiled with: * * arm-none-eabi-gcc -g -c -O2 -mcpu=arm926ej-s -o tim.o tim.c */ struct SPeriphRegisterBits { // unsigned int bit0 : 1; // unsigned int bits1_10 : 10; unsigned int bit0 : 8; unsigned int bits1_10 : 3; unsigned int bit11 : 1; unsigned int : 16; unsigned int bits28_31 : 4; }; union UPerihRegister { struct SPeriphRegisterBits bits; unsigned int all; }; struct SPeripheralRegs { union UPerihRegister REG1; union UPerihRegister REG2; }; //extern volatile struct SPeripheralRegs THIS_PERIPHERAL; // //Then I access things by // //THIS_PERIPHERAL.REGISTER.bits.whatever = something; volatile union UPerihRegister *ur1 = (union UPerihRegister *) 0x80001000; volatile struct SPeriphRegisterBits *sr1 = (struct SPeriphRegisterBits *) 0x80001000; void tim_test(void) { unsigned int val; // val = ur1->bits.bit0; // ur1->bits.bit0 = 1; val = sr1->bit0; sr1->bit0 = sr1->bit0 + 3; THIS_PERIPHERAL.REG1.bits.bit0 = 3; return; } ============================================================================ and here's the objdump output: ============================================================================ tim.o: file format elf32-littlearm Disassembly of section .text: 00000000 <tim_test>: unsigned int val; // val = ur1->bits.bit0; // ur1->bits.bit0 = 1; val = sr1->bit0; 0: e59f3024 ldr r3, [pc, #36] ; 2c <tim_test+0x2c> 4: e5933000 ldr r3, [r3] 8: e5d32000 ldrb r2, [r3] sr1->bit0 = sr1->bit0 + 3; c: e5d32000 ldrb r2, [r3] 10: e2822003 add r2, r2, #3 14: e20220ff and r2, r2, #255 ; 0xff 18: e5c32000 strb r2, [r3] THIS_PERIPHERAL.REG1.bits.bit0 = 3; 1c: e59f300c ldr r3, [pc, #12] ; 30 <tim_test+0x30> 20: e3a02003 mov r2, #3 24: e5c32000 strb r2, [r3] return; } 28: e12fff1e bx lr ... ============================================================================ Note the use of ldrb and strb instead of ldr and str in the generated code. So it's confirmed that having a 8-bit bitfield in the LSBs of the struct causes the generated code to be broken, at least in the version of gcc I am using. My notes from a few years ago reflect my memory that I was able to generate broken code for other cases as well, depending on exactly what was marked as volatile (which admittedly was a factor I had forgotten). You don't happen to have any structs with the lower 8 bits being a single bitfield by any chance do you ? It would be interesting to see what objdump says in your case. Simon. -- Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP Microsoft: Bringing you 1980s technology to a 21st century world
Reply by ●September 23, 20152015-09-23
On 2015-09-23, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:> > I've duplicated this using a variant of your structs and caused gcc to > generate invalid code for accessing registers. >I've now got gcc to generate bad code (at least as far as accessing registers are concerned) whenever the bitfield is an 8-bit bitfield which lies on a byte boundary. A variant of the first example: ============================================================================ /* * Compiled with: * * arm-none-eabi-gcc -g -c -O2 -mcpu=arm926ej-s -o tim.o tim.c */ struct SPeriphRegisterBits { // unsigned int bit0 : 1; // unsigned int bits1_10 : 10; unsigned int : 8; unsigned int : 8; unsigned int bits1_10 : 3; unsigned int bit11 : 1; unsigned int bits28_31 : 4; unsigned int bit0 : 8; }; union UPerihRegister { struct SPeriphRegisterBits bits; unsigned int all; }; struct SPeripheralRegs { union UPerihRegister REG1; union UPerihRegister REG2; }; //extern volatile struct SPeripheralRegs THIS_PERIPHERAL; // //Then I access things by // //THIS_PERIPHERAL.REGISTER.bits.whatever = something; volatile union UPerihRegister *ur1 = (union UPerihRegister *) 0x80001000; volatile struct SPeriphRegisterBits *sr1 = (struct SPeriphRegisterBits *) 0x80001000; void tim_test(void) { unsigned int val; // val = ur1->bits.bit0; // ur1->bits.bit0 = 1; val = sr1->bit0; sr1->bit0 = sr1->bit0 + 3; THIS_PERIPHERAL.REG1.bits.bit0 = 3; return; } ============================================================================ and the code it generates: ============================================================================ tim.o: file format elf32-littlearm Disassembly of section .text: 00000000 <tim_test>: unsigned int val; // val = ur1->bits.bit0; // ur1->bits.bit0 = 1; val = sr1->bit0; 0: e59f3024 ldr r3, [pc, #36] ; 2c <tim_test+0x2c> 4: e5933000 ldr r3, [r3] 8: e5d32003 ldrb r2, [r3, #3] sr1->bit0 = sr1->bit0 + 3; c: e5d32003 ldrb r2, [r3, #3] 10: e2822003 add r2, r2, #3 14: e20220ff and r2, r2, #255 ; 0xff 18: e5c32003 strb r2, [r3, #3] THIS_PERIPHERAL.REG1.bits.bit0 = 3; 1c: e59f300c ldr r3, [pc, #12] ; 30 <tim_test+0x30> 20: e3a02003 mov r2, #3 24: e5c32003 strb r2, [r3, #3] return; } 28: e12fff1e bx lr ... ============================================================================ Do any of your structs have any 8-bit bitfields which are on a byte boundary ? It would be interesting to see what gcc is doing in your case. The cross compiler I am using is gcc 4.5.{something} although, based on the other problem reports I have come across, I suspect it's still a problem in at least some newer versions as well. Simon. -- Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP Microsoft: Bringing you 1980s technology to a 21st century world
Reply by ●September 23, 20152015-09-23
Tim Wescott wrote: <snip>> I could see how it might do this by the intent of the language, but I > could see that I may just be bamboozling the optimizer, rather than > intelligently directing it. So -- I'm curious, in a more-than-idly sort > of way. >objdump is faster than we are at this :) -- Les Cargill
Reply by ●September 24, 20152015-09-24
On 23/09/15 23:54, Simon Clubley wrote:> On 2015-09-23, Tim Wescott <seemywebsite@myfooter.really> wrote: >> >> I would REALLY like one of the compiler guys to weigh in on this, because >> I'm wondering if collecting all those unions into a struct and calling it >> volatile isn't forcing a 32-bit access. It would be nice to know if I'm >> just bamboozling the optimizer into doing what I want, or if I'm actually >> _telling_ it to do the right thing. >> > > I've duplicated this using a variant of your structs and caused gcc to > generate invalid code for accessing registers. > > I wasn't able to duplicate the range of cases I thought I could, but I > found this in one of my header files I created some years ago: > > * > * Using bitfield structs versus bitmasks to access register bitfields: > * My original plan was to use bitfield structs to access the register > * bitfields. However, gcc is generating code to access the bitfields > * at a byte, and not longword, level. As the MMIO fields are longword > * access only, this results in invalid code been generated. > * > * Marking the enclosing 32-bit struct, instead of the underlying bitfields, > * as volatile results in code been correctly generated for most cases. > * However, when the field to be modified is a byte in the least significant > * 8 bits of the register, then gcc still generates a byte level store > * even though it performed a longword level read. There may also be other > * cases which I have not yet come across. > * > * As a result, I have switched to using bitmasks to access bitfields in a > * register. > * > > so it's clear that _where_ you place the volatile attribute is critical > to how the generated code looks (and how broken it is). A longword is > 32 bits BTW; that's my DEC background showing. :-) > > Here's a self contained example: > > ============================================================================ > /* > * Compiled with: > * > * arm-none-eabi-gcc -g -c -O2 -mcpu=arm926ej-s -o tim.o tim.c > */ > > struct SPeriphRegisterBits > { > // unsigned int bit0 : 1; > // unsigned int bits1_10 : 10; > unsigned int bit0 : 8; > unsigned int bits1_10 : 3; > unsigned int bit11 : 1; > unsigned int : 16; > unsigned int bits28_31 : 4; > }; > > union UPerihRegister > { > struct SPeriphRegisterBits bits; > unsigned int all; > }; > > struct SPeripheralRegs > { > union UPerihRegister REG1; > union UPerihRegister REG2; > }; > > //extern > volatile struct SPeripheralRegs THIS_PERIPHERAL; > > // > //Then I access things by > // > //THIS_PERIPHERAL.REGISTER.bits.whatever = something; > > volatile union UPerihRegister *ur1 = (union UPerihRegister *) 0x80001000; > > volatile struct SPeriphRegisterBits *sr1 = (struct SPeriphRegisterBits *) 0x80001000; > > void tim_test(void) > { > unsigned int val; > > // val = ur1->bits.bit0; > // ur1->bits.bit0 = 1; > > val = sr1->bit0; > > sr1->bit0 = sr1->bit0 + 3; > > THIS_PERIPHERAL.REG1.bits.bit0 = 3; > return; > } > ============================================================================ > > and here's the objdump output: > > ============================================================================ > tim.o: file format elf32-littlearm > > > Disassembly of section .text: > > 00000000 <tim_test>: > unsigned int val; > > // val = ur1->bits.bit0; > // ur1->bits.bit0 = 1; > > val = sr1->bit0; > 0: e59f3024 ldr r3, [pc, #36] ; 2c <tim_test+0x2c> > 4: e5933000 ldr r3, [r3] > 8: e5d32000 ldrb r2, [r3] >On my testing, this second read is ldr. (Note that the first "ldr r3, [r3]" is just because the sr1 pointer is not made const or static.)> sr1->bit0 = sr1->bit0 + 3; > c: e5d32000 ldrb r2, [r3] > 10: e2822003 add r2, r2, #3 > 14: e20220ff and r2, r2, #255 ; 0xff > 18: e5c32000 strb r2, [r3] >On my testing, both accesses are 32-bit. This leads to a little more code between the load and the store, to get the masking right.> THIS_PERIPHERAL.REG1.bits.bit0 = 3; > 1c: e59f300c ldr r3, [pc, #12] ; 30 <tim_test+0x30> > 20: e3a02003 mov r2, #3 > 24: e5c32000 strb r2, [r3]Same again - it's all 32-bit accesses, but the code is more complex for the masking.> return; > } > 28: e12fff1e bx lr > ... > ============================================================================ > > Note the use of ldrb and strb instead of ldr and str in the generated code. > > So it's confirmed that having a 8-bit bitfield in the LSBs of the struct > causes the generated code to be broken, at least in the version of gcc > I am using.All we have confirmed here is that you have a gcc version with the old volatile bitfield bug. (Note that arguably it was not a bug, since the C standards don't say anything about what size accesses to a bitfield should use. gcc used the smallest access because it gave shorter and faster code. But even if it is not technically a bug, it is still a surprise - and leads to problems on architectures that expect accesses to be 32-bit in such circumstances. Hence the option was created, and made standard on targets such as ARM that need it.) What version of gcc are you using here?> > My notes from a few years ago reflect my memory that I was able to > generate broken code for other cases as well, depending on exactly > what was marked as volatile (which admittedly was a factor I had > forgotten). > > You don't happen to have any structs with the lower 8 bits being a > single bitfield by any chance do you ? It would be interesting to > see what objdump says in your case. > > Simon. >
Reply by ●September 24, 20152015-09-24
On 24/09/15 01:12, Simon Clubley wrote:> > The cross compiler I am using is gcc 4.5.{something} although, based > on the other problem reports I have come across, I suspect it's still > a problem in at least some newer versions as well. >That explains it - the issue was fixed in 4.6. There are a few questionable issues in later versions of the compiler, in regard to particularly complex bitfield structures. The trouble here is that volatile accesses are not well defined in C - much of it is left as "implementation defined" (meaning the compiler can choose, but it should be consistent and preferably documented) or even "unspecified" (meaning it has to follow certain behaviour patterns, but the compiler can choose which to use - and it can choose arbitrarily from case to case). As with all uses of volatile, if you try to make the code unreasonably complicated, you run the risk of code that does not do as you hoped - usually because you don't understand it yourself, but sometimes because of imprecise standards definitions and sometimes because of compiler bugs. But as long as you stick to rational usage of volatile bitfields, and gcc 4.6 or later, you should not see any problems.
Reply by ●September 24, 20152015-09-24
Tim Wescott <seemywebsite@myfooter.really> wrote:> I would REALLY like one of the compiler guys to weigh in on this, because > I'm wondering if collecting all those unions into a struct and calling it > volatile isn't forcing a 32-bit access.I'm not a compiler guy, but having looked at bitfields a short time ago, my understanding is that the language standard always allows the compiler to access bitfields in smaller units if it feels like it. GCC has a '-fstrict-volatile-bitfields' flag for disabling this optimization. -a
Reply by ●September 24, 20152015-09-24
On 23/09/15 21:55, Tim Wescott wrote:> This question came up in the thread titled "Re: Ada, was: Re: Fundamental > C question about "if" statements". > > I felt it deserved its own thread, because it's far off topic to what's > become a very interesting, but far off topic thread. >Maybe I should have read this post before answering in the other thread...> I, for one, would really like to see one of our resident compiler experts > answer the question at the bottom.Let me try (see below).> > On Wed, 23 Sep 2015 19:36:57 +0000, Simon Clubley wrote: > >> On 2015-09-23, Tim Wescott <seemywebsite@myfooter.really> wrote: >>> On Wed, 23 Sep 2015 18:36:43 +0000, Simon Clubley wrote: >>>> Good luck doing that in a reliable way on ARM with gcc when you are >>>> accessing device registers. >>>> >>>> If you access a bitfield in the lower 8 bits of a register, the gcc >>>> optimiser can turn that into 8-bit ldrb/strb opcodes instead of 32-bit >>>> ldr/str opcodes and thereby causing the program to break if you need >>>> to access your registers in units of greater than 8 bits. >>> >>> That's a very interesting comment, because that's basically How It Is >>> Done here, and I've had several years worth of success on both ST and >>> TI/ >>> Luminary devices. I'm not sure if that's because of the way I define >>> my bitfields (absolutely everything is defined as a struct of >>> bitfields, then a union of that struct and a 32-bit integer), or if >>> I've been lucky, >>> or what. >>> >>> >> Do your MCUs _require_ register access in units of 16/32 bits ? >> >> If so, you may be doing something which works for now, but may not work >> in the future. >> >> It might be worth having a quick look with objdump just to be sure. >> >> I have personally had this happen to me and it broke my code due to the >> register access size constraints no longer been true. >> >>> So it looks like: >>> >>> struct SPeriphRegisterBits { >>> unsigned int bit0 : 1; unsigned int bits1_10 : 10; unsigned >>> int bit11 : 1; unsigned int : 16; unsigned int >>> bits28_31 : 4; >>> }; >>> >>> union UPerihRegister { >>> struct SPeriphRegisterBits bits; >>> unsigned int all; >>> } >>> >>> All of the various register definitions then get collected into a >>> structure for the peripheral, which gets declared as "extern const", >>> and defined in the linker command file. >>> >>> >> This is a bit more involved than what I was doing. I wonder if this is >> what is saving you for now. > > Yes, my MCU's require access in units of 16 or 32 bits, at least in > places -- at least, if the ST data book is to be believed that's the case.It is the case for many MCU's, including most ARM devices.> > I would REALLY like one of the compiler guys to weigh in on this, because > I'm wondering if collecting all those unions into a struct and calling it > volatile isn't forcing a 32-bit access. It would be nice to know if I'm > just bamboozling the optimizer into doing what I want, or if I'm actually > _telling_ it to do the right thing.As long as you are using gcc 4.6 (or any other compiler that implements volatile bitfield accesses with exactly the field size given in the bitfield definition), you are fine. C does not specify what size such accesses should have, but leaves it up to the implementation. So gcc 4.5 and earlier were technically correct to the C standards - the implementation picked a size that gave smaller and faster code, but that unfortunately did not work on many ARM microcontrollers.> > So, the final definitions (that I left out) is > > struct SPeripheralRegs > { > UPeriphRegister1 REG1; > UPeriphRegister2 REG2; > -- etc -- > }; > > extern volatile SPeripheralRegs THIS_PERIPHERAL; > > Then I access things by > > THIS_PERIPHERAL.REGISTER.bits.whatever = something; > > Is (oh compiler guys!!!) this directing the compiler to act as if I wrote: > > UPeriphRegister A; > > A.all = static_cast<volatile unsigned int>THIS_PERIPHERAL.REGISTER.all; > > A.bits.whatever = something; > > static_cast<volatile unsigned int>THIS_PERIPHERAL.REGISTER.all = A.all; >That's pretty much what happens, yes. With each RMW set done independently for each bitfield access.> I could see how it might do this by the intent of the language, but I > could see that I may just be bamboozling the optimizer, rather than > intelligently directing it. So -- I'm curious, in a more-than-idly sort > of way. >It is implementation dependent (defined by the compiler), but gcc 4.6 onwards defines it to work in the way you expect here. I've pasted some bits from my replies on the other thread here, in case people prefer to reply in this thread. But I will continue to read and comment on both threads. Whether that will be helpful or confusing remains to be seen... On 23/09/15 21:19, Tim Wescott wrote:> > So it looks like: > > struct SPeriphRegisterBits > { > unsigned int bit0 : 1; > unsigned int bits1_10 : 10; > unsigned int bit11 : 1; > unsigned int : 16; > unsigned int bits28_31 : 4; > }; > > union UPerihRegister > { > struct SPeriphRegisterBits bits; > unsigned int all; > }A few comments here. Don't use "unsigned int" - use uint32_t. Make it all absolutely clear and fully specified. You don't need to do this in two parts. Put it altogether, using an anonymous struct (this is a very common extension to C99, which has existed in gcc since close to the dawn of time, and it is now part of C11). I prefer to make typedefs of all my types, so that I don't need union or struct tags later. But that is a matter of style. And I strongly dislike systems Hungarian style, such as prefixing a type with U just because it happens to be a union (especially if you haven't used typedef, so that you have to have the "union" tag too). Make your padding bits explicit with a name. The C++ memory model says that unnamed fields may not be changed by any write operations (C is more lenient), which can cause trouble for fields. It's better to make all your padding explicit. I also like to add a static assertion to check that there are no mistakes when defining structures like these. Of preference I also use "-Wpadded" in gcc to warn of any unexpected padding - but that does not play well with most manufacturer-supplied headers. typedef union { struct { uint32_t bit0 : 1; uint32_t bits1_10 : 10; uint32_t bit11 : 1; uint32_t padding : 16; uint32_t bits28_31 : 4; }; uint32_t all; } periphRegister_t; static_assert(sizeof(periphRegister_t) == 4, "Check bitfield definition for PeriphRegister"); Having said all that, your definition will also work correctly (assuming you've got the volatile there somewhere, and are using gcc 4.6 or above which fixes the volatile bitfield problem).> > All of the various register definitions then get collected into a > structure for the peripheral, which gets declared as "extern const", and > defined in the linker command file.I presume you mean "extern volatile const", and that's only for the read-only registers. It's common to define registers like this: #define periphRegister (*((volatile periphRegister_t*) 0x12345678)) (Registers can, of course, be collected in larger structures first.) The advantage of that is that everything is defined in a C header file - you don't need to mess around with linker files too. And since the compiler knows exactly what addresses are used, it can generate more efficient code than if they have to be sorted out at link time. If you write: void test1(void) { periphRegister.bit0 = 1; periphRegister.bit11 = 0; } then the compiler will (correctly) generate two independent RMW sequences, optimising only the calculations of the address to use. If you want both assignments to be done in a single RMW operation, you must do so explicitly - that's what the "all" field is for. void test2(void) { periphRegister_t cache; cache.all = periphRegister.all; cache.bit0 = 1; cache.bit11 = 0; periphRegister.all = cache.all; }
Reply by ●September 24, 20152015-09-24
Anders.Montonen@kapsi.spam.stop.fi.invalid writes:> Tim Wescott <seemywebsite@myfooter.really> wrote: > >> I would REALLY like one of the compiler guys to weigh in on this, because >> I'm wondering if collecting all those unions into a struct and calling it >> volatile isn't forcing a 32-bit access. > > I'm not a compiler guy, but having looked at bitfields a short time ago, > my understanding is that the language standard always allows the > compiler to access bitfields in smaller units if it feels like it. GCC > has a '-fstrict-volatile-bitfields' flag for disabling this > optimization.The *C language* standard may allow it, but on a particular machine architecture there can be an ABI that specifies these things. For example the ARM eabi. The OP was using the bitfield feature for accessing hardware registers on microcontrollers. So the architecture will be pretty much fixed by definition and it should be safe to use (compilers bugs aside). -- John Devereux
Reply by ●September 24, 20152015-09-24
On Thu, 24 Sep 2015 15:28:59 +0200, David Brown wrote:> On 24/09/15 01:12, Simon Clubley wrote: >> >> The cross compiler I am using is gcc 4.5.{something} although, based on >> the other problem reports I have come across, I suspect it's still a >> problem in at least some newer versions as well. >> >> > That explains it - the issue was fixed in 4.6. > > There are a few questionable issues in later versions of the compiler, > in regard to particularly complex bitfield structures. The trouble here > is that volatile accesses are not well defined in C - much of it is left > as "implementation defined" (meaning the compiler can choose, but it > should be consistent and preferably documented) or even "unspecified" > (meaning it has to follow certain behaviour patterns, but the compiler > can choose which to use - and it can choose arbitrarily from case to > case). > > As with all uses of volatile, if you try to make the code unreasonably > complicated, you run the risk of code that does not do as you hoped - > usually because you don't understand it yourself, but sometimes because > of imprecise standards definitions and sometimes because of compiler > bugs.Or, as I alluded to in my original post, you run the risk of making code that's complicated to confuse the optimizer enough that it throws up its hands and does what you want at the moment, without guaranteeing that future optimizers wouldn't be able to unwind what you did and break it by making it "right". Which is why I was asking about the intent of the volatile keyword WRT unions. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com







