On 13/12/2012 14:06, Simon Clubley wrote:
> On 2012-12-13, David Brown <david@westcontrol.removethisbit.com> wrote:
>> On 13/12/2012 01:13, Simon Clubley wrote:
>>> On 2012-12-12, Tim Wescott <tim@seemywebsite.com> wrote:
>>>>
>>>> (Totally aside: the gnu in-line assembly syntax is my all-time favorite,
>>>> and if it weren't for the many obvious difficulties I'd wish that it were
>>>> adopted as a standard. As far as I can tell, it solves just about all of
>>>> the problems with embedded inline assembly that _can_ be solved; it's in
>>>> such a different league from the inline assembly syntax that just dumps
>>>> your text into the assembly file that it should almost have a different
>>>> name.)
>>>>
>>>
>>> One bit of advice I will offer for gcc inline assembly is to make sure
>>> the asm fragments are always marked as volatile. There's a very good
>>> reason for this attribute being supported on the asm statement.
>>>
>>> Simon.
>>>
>>
>> Well, that's good advice for people that are not confident with gcc
>> inline assembly (and its syntax is powerful enough that it can take a
>> /long/ time to be confident!). But the good reason for having
>> "volatile" support for asm statements is that you /don't/ always want it
>> - if you always needed it, it would not exist (as is the case for many
>> other compilers' inline assembly syntax).
>>
>
> That's actually a very good point for when you are doing calculations.
>
> However, I use inserted assembly code only for control type operations
> (read a processor register, disable/enable interrupts, etc) and
> that's what I was thinking of when I responded to Tim.

Yes, that's a common use of inline assembly - probably the most common
use case. I just wanted to point out that there are other cases.

Another thing worth mentioning here is that "volatile" is /not/ enough
to make things like enable/disable interrupts work as you might expect.
People often think "volatile" means "do exactly what I say, when I say
it". They forget the qualifier "with respect to other volatile
accesses". A classic mistake is to do something like this:

    asm volatile("disableinterrupts");
    x += 1; // Atomic increment of x
    asm volatile("enableinterrupts");

Unless "x" is also declared volatile, the compiler is free to move reads
or writes to "x" around the "asm volatile" statements. It can even omit
the read and write altogether if it feels it is unnecessary (say, if
this code were followed by "x = 0;").

The key to making this work correctly every time is to add a "memory
clobber" to the assembly statement. "clobber" lists tell gcc which
resources might be used or changed by the assembly statement in addition
to those in the input and output lists. In particular, a "memory"
clobber tells gcc that the assembly may refer to data in memory (so any
pending writes need to be handled before the asm statement), and that
memory might be changed by the asm statement (so any reads must be done
after the statement). Thus the enable/disable interrupt code needs to
be written as:

    asm volatile("disableinterrupts" ::: "memory");
    x += 1;
    asm volatile("enableinterrupts" ::: "memory");

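To make that a little more concrete, here is a minimal sketch of how the pattern might be wrapped up. The macro and function names are my own invention for illustration, and the asm strings are left empty as stand-ins for the target's real instructions (for instance "cpsid i" and "cpsie i" on a Cortex-M), so that the fragment builds with gcc or clang on any target. An empty asm with a "memory" clobber still acts as a compiler-level barrier, which is the part being shown.

```c
#include <stdint.h>

/* Hypothetical macro names. The empty strings are placeholders for the
 * real interrupt disable/enable instructions of the target; only the
 * "memory" clobber matters for the point being made here. */
#define DISABLE_INTERRUPTS() __asm__ volatile("" ::: "memory")
#define ENABLE_INTERRUPTS()  __asm__ volatile("" ::: "memory")

static uint32_t x;   /* deliberately not declared volatile */

void increment_x(void)
{
    DISABLE_INTERRUPTS();
    x += 1;          /* the clobbers keep this read-modify-write between the two asm statements */
    ENABLE_INTERRUPTS();
}

uint32_t read_x(void)
{
    return x;
}
```

Note that this only constrains the compiler. Whether the sequence is really atomic on a given chip still depends on the actual instructions you put in the asm strings.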
>
> In my case, it's far less about not being confident and more about needing
> the code inserted at the point I choose without the compiler trying to
> second guess me.
>
>> If an inline assembly statement does not return any values, then gcc
>> automatically considers it "volatile", because there would be little
>> point in having it otherwise.
>>
>
> I know, but I still mark the code as volatile in that case. It's a
> style thing for me in that I like to make my intentions clear in this
> type of code even when the default setting achieves the same thing.
> (I only write the code once (hopefully :-)), but I read it many times.)
>
I agree, and write the volatile explicitly myself too.
>> But if you have inline assembly for calculations, you don't want it to
>> be volatile. Perhaps you have a 16-bit processor that has an
>> instruction for doing a 16x16 bit multiply and returning the top 16
>> bits. Using that instruction is going to be a lot faster than the C
>> method, which requires a 32x32 bit multiply and a shift. So you write a
>> function like this:
>>
>> static inline uint16_t multTop(uint16_t a, uint16_t b) {
>>     uint16_t m = a;
>>     asm (" multtop %[r], %[x]" : [r] "+r" (m) : [x] "r" (b));
>>     return m;
>> }
>>
>> You can then use this function in code, and the "function call" will use
>> just one assembly instruction - assuming the operands are already in a
>> register (the local variable and return statement will not lead to any
>> extra moves). But because it is not "volatile", the compiler can assume
>> that when called with the same values, it gives the same result. This
>> means it can apply common subexpression elimination, code hoisting, and
>> other optimisations in exactly the same way as it would with normal C
>> arithmetic.
>>
>
> I fully agree with this and this wasn't a usage case I was considering
> when I replied to Tim.
>
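For comparison, the "C method" mentioned above would look roughly like the sketch below. The function name is made up for illustration. The casts to uint32_t are there so that the multiplication is done in an unsigned 32-bit type, rather than relying on integer promotion to a signed int.

```c
#include <stdint.h>

/* Portable C version (hypothetical name): widen both operands to
 * 32 bits, multiply, and keep the top 16 bits of the product. */
static inline uint16_t mult_top_c(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}
```

As a sanity check, mult_top_c(0x8000, 0x8000) gives 0x4000, since the full product is 0x40000000. A good compiler may spot the pattern on some targets, but you have to look at the generated code to find out.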
>> "Volatile" on assembly statements may not mean what you think it means.
>
> Oh, I made sure I knew what it meant before I started using it. :-)
I would have written "yous", if the second person plural existed in
English (outside of Glasgow).
> For example:
>
>> For example, if you have a loop with a volatile assembly statement
>> inside, and the compiler has unrolled the loop, the assembly statement
>> will turn up twice in the output assembly.
>
> This is _exactly_ what I want. If I am protecting a variable update
> within a loop, I want the protection to be unrolled along with the
> variable I am updating. [*]
>
>> And if the compiler knows
>> that a volatile assembly statement will never be reached by the code,
>> then it will be omitted entirely. And of course, just like any other
>> volatile access in C, the compiler can move it around freely as long as
>> it doesn't change the order with respect to other volatile accesses.
>>
>
> These are good points. It's because of this that when I started using
> inserted assembly language, I spent a good deal of time experimenting
> with various variants and looking at the generated code with objdump
> to make sure I really understood what was going on.
>
I find listing files from the compiler to be easier to read than
objdumps, but I fully agree that you have to look at the generated
assembly code to be sure of exactly what is going on.
>>
>> Part of the power of gcc's inline assembly is that you can express
>> exactly what you mean, including the constraints you have - and let the
>> optimiser work within those constraints. Blindly putting "volatile" on
>> all assembly statements ties the compiler's hands.
>>
>
> [*] You may ask (quite appropriately) why I simply don't use the builtin
> primitives for the environment that the code is running in.
>
I was not going to ask, because I can think of several reasons (in
addition to the ones you mention here). When working with CodeWarrior
on a PPC, I found that in some cases my own gcc inline assembly
functions were smaller, faster, more type-safe, better named, and easier
to understand than the compiler's builtin intrinsics.
> The answer is that the bare metal libraries I have developed run on a
> range of environments from small 8-bit PICs to 32-bit ARM MCUs and all
> the primitives are architecture (and compiler) dependent.
>
> Those primitives may not always provide what I need either. For example,
> I can't just blindly disable and re-enable interrupts when updating a
> variable in a routine called from an interrupt handler because in some
> MCUs (e.g. the PIC18) interrupts are fully disabled in a handler while in
> others (ARM with priority nesting) interrupts are only partially disabled.
>
> OTOH, when those same routines are called from mainline code, I absolutely
> _must_ disable and enable interrupts always regardless of MCU.
>
> In order to be able to use a common source base across a range of
> architectures with different behaviours, I have my own set of primitives
> which read and save the current context and restore it later on
> in a routine. This is what started me down the whole embedded assembly
> code path in the first place; all the architecture specific stuff
> is isolated in one header file.
>
> As you can see, I have given this quite a bit of thought. :-)
>
Indeed you have - but other less experienced people read these posts,
and I think between us we have given them quite a lot of information here.
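For those readers, here is a rough sketch of the kind of save-and-restore primitives described above. This is not Simon's code. All the names are hypothetical, and the interrupt enable state is simulated with an ordinary variable so that the fragment can be built and run on a host. A real port would read and write the MCU's status register with inline assembly in the one architecture-specific header.

```c
#include <stdbool.h>

/* Simulated global interrupt-enable flag (an assumption made so the
 * sketch runs on a host; real hardware keeps this in a status register). */
static bool irq_enabled = true;

typedef bool irq_state_t;

/* Save the current interrupt state, then disable interrupts. */
static inline irq_state_t irq_save(void)
{
    irq_state_t previous = irq_enabled;
    irq_enabled = false;
    __asm__ volatile("" ::: "memory");  /* protected accesses stay after this point */
    return previous;
}

/* Put back whatever state was in force before irq_save(). */
static inline void irq_restore(irq_state_t previous)
{
    __asm__ volatile("" ::: "memory");  /* protected accesses stay before this point */
    irq_enabled = previous;
}

static unsigned counter;

/* Can be called from mainline code or from an interrupt handler. */
void bump_counter(void)
{
    irq_state_t s = irq_save();
    counter += 1;
    irq_restore(s);
}
```

The point of returning the old state is that the routine does not blindly re-enable interrupts. If it is called with interrupts already disabled, they stay disabled afterwards.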
> Simon.
>