Question About Sequence Points and Interrupt/Thread Safety

I've included a function below and the generated STM8 assembly-language.  As 
it ends up (based on the assembly-language), the function is interrupt safe 
as intended.

My question is, let's assume I have this:

DI();
if (x)
   x--;
EI();

where DI and EI just expand to the compiler's asm( ) feature to insert the 
right machine instruction to disable and enable interrupts, ...

Is there any reason that the compiler cannot delay writing "x" back so that 
I get effectively this:

DI();
cpu_register = x;
if(cpu_register)
   cpu_register--;
EI();
x = cpu_register;

???

It isn't clear to me if "volatile" is required on "x" or if there is any 
possibility of the write of the variable back to memory being delayed.

Thanks for any insight.

Function below.

The Lizard

----------

//--------------------------------------------------------------------------------
//DESCRIPTION
//   Decrements an array of zero or more 8-bit unsigned integers, but not
//   below zero.  This function is intended for software timers, but may have
//   other applications as well.
//
//INPUTS
//   in_arg
//      Pointer to first element to be decremented.  This pointer must
//      be valid if in_nelem > 0.
//
//   in_nelem
//      Number of elements to be decremented.  If this value is 0, in_arg
//      will not be dereferenced and may be NULL or otherwise invalid.
//
//INTERRUPT CONSIDERATIONS
//   This function must be called only with interrupts enabled (it uses simple
//   DI/EI protocol).
//
//   This function may be called from non-ISR software only.
//
//   In the case of software timers, individual software timers may be safely
//   shared with interrupt service, due to the critical section protocol.  So,
//   an ISR may safely set and test software timers.  Note that the behavior
//   of individual software timers is guaranteed by DI/EI, but the relationship
//   between timers is not, as an interrupt may occur while an array or sets
//   of arrays are being decremented.
//
//MNEMONIC
//   "dec"     : decrement.
//   "u8"      : unsigned 8-bit.
//   "arr"     : array.
//   "nbz"     : not below zero.
//
//UNIT TEST HISTORY
//
//
//--------------------------------------------------------------------------------
void MF_decu8arr_nbz(UINT8 *in_arg, UINT16 in_nelem)
   {
   while (in_nelem)
      {
      DI();
      if (*in_arg)
         (*in_arg)--;
      EI();
      in_nelem--;
      }
   }

 700                     ; 289 void MF_decu8arr_nbz(UINT8 *in_arg, UINT16 in_nelem)
 700                     ; 290    {
 701                      switch .text
 702  00b4               f_MF_decu8arr_nbz:
 704  00b4 89             pushw x
 705       00000000      OFST: set 0
 708  00b5 200d           jra L552
 709  00b7               L352:
 710                     ; 293       DI();
 713  00b7 9b             sim
 715                     ; 294       if (*in_arg)
 717  00b8 1e01           ldw x,(OFST+1,sp)
 718  00ba f6             ld a,(x)
 719  00bb 2701           jreq L162
 720                     ; 295          (*in_arg)--;
 722  00bd 7a             dec (x)
 723  00be               L162:
 724                     ; 296       EI();
 727  00be 9a             rim
 729                     ; 297       in_nelem--;
 731  00bf 1e06           ldw x,(OFST+6,sp)
 732  00c1 5a             decw x
 733  00c2 1f06           ldw (OFST+6,sp),x
 734  00c4               L552:
 735                     ; 291    while (in_nelem)
 737  00c4 1e06           ldw x,(OFST+6,sp)
 738  00c6 26ef           jrne L352
 739                     ; 299    }
 742  00c8 85             popw x
 743  00c9 87             retf
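
One thing the listing shows: the loop never advances in_arg (the same pointer is reloaded from (OFST+1,sp) on every pass), so the first element is decremented in_nelem times. A minimal sketch that steps through the array instead, assuming that is what the DESCRIPTION comment intends. DI() and EI() are no-op placeholders so the sketch builds on a host compiler, and the name dec_u8_array_nbz is made up for illustration.

```c
#include <stdint.h>

/* Placeholders only. On the real target these would expand to the
   interrupt-disable and interrupt-enable instructions (sim and rim). */
#define DI() ((void)0)
#define EI() ((void)0)

/* Decrement each of in_nelem bytes starting at in_arg, but not below zero. */
void dec_u8_array_nbz(uint8_t *in_arg, uint16_t in_nelem)
{
    while (in_nelem) {
        DI();
        if (*in_arg)
            (*in_arg)--;
        EI();
        in_arg++;       /* step to the next element */
        in_nelem--;
    }
}
```

The critical section still covers only one element at a time, so the relationship between timers remains unprotected, as the header comment already says.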

Reply by Richard Heathfield ● February 24, 2009

Jujitsu Lizard said:

<snip>
 
> My question is, let's assume I have this:
> 
> DI();
> if (x)
>    x--;
> EI();
> 
> where DI and EI just expand to the compiler's asm( ) feature to
> insert the right machine instruction to disable and enable
> interrupts, ...
> 
> Is there any reason that the compiler cannot delay writing "x"
> back so that I get effectively this:
> 
> DI();
> cpu_register = x;
> if(cpu_register)
>    cpu_register--;
> EI();
> x = cpu_register;
> 
> ???

3.6: "A full expression is an expression that is not part of another 
expression.  Each of the following is a full expression: an 
initializer; the expression in an expression statement; the 
controlling expression of a selection statement ( if or switch ); 
the controlling expression of a while or do statement; each of the 
three expressions of a for statement; the expression in a return 
statement. The end of a full expression is a sequence point."

Having said that, the "as if" rule applies. If the implementation 
"wants" to delay the assignment to x, it is permitted to do so 
/provided/ that a strictly conforming program can't tell the 
difference.

<snip>

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Reply by Jujitsu Lizard ● February 24, 2009

"Richard Heathfield" <rjh@see.sig.invalid> wrote in message 
news:-q6dnZpxK-fwETnUnZ2dnUVZ8j-WnZ2d@bt.com...
>>
>> Is there any reason that the compiler cannot delay writing "x"
>> back so that I get effectively this:
>>
>> DI();
>> cpu_register = x;
>> if(cpu_register)
>>    cpu_register--;
>> EI();
>> x = cpu_register;
>>
>> ???
>
> 3.6: "A full expression is an expression that is not part of another
> expression.  Each of the following is a full expression: an
> initializer; the expression in an expression statement; the
> controlling expression of a selection statement ( if or switch );
> the controlling expression of a while or do statement; each of the
> three expressions of a for statement; the expression in a return
> statement. The end of a full expression is a sequence point."
>
> Having said that, the "as if" rule applies. If the implementation
> "wants" to delay the assignment to x, it is permitted to do so
> /provided/ that a strictly conforming program can't tell the
> difference.
>
> <snip>

"tell the difference" is a bit of an ambiguous phrase.

I think you are saying that in this case the compiler is not free to delay 
the write of "x" because that would be a logical error -- a conforming ISR 
could in fact "tell the difference", and it would result in logical errors 
in the program.

Am I understanding your response correctly?

Thanks, The Lizard

Reply by Flash Gordon ● February 24, 2009

Jujitsu Lizard wrote:
> I've included a function below and the generated STM8 
> assembly-language.  As it ends up (based on the assembly-language), the 
> function is interrupt safe as intended.

None of asm, interrupts or threads are actually part of standard C; they 
are extensions provided by some implementations, and how they work 
differs. You have cross-posted to comp.arch.embedded where there are 
rather more people who know about this sort of thing than on comp.lang.c 
where most of the issues are not really topical. I've set follow-ups to 
comp.arch.embedded for further discussion.

> My question is, let's assume I have this:
> 
> DI();
> if (x)
>   x--;
> EI();
> 
> where DI and EI just expand to the compiler's asm( ) feature to insert 
> the right machine instruction to disable and enable interrupts, ...

OK, so we know what these do, so we can make some educated guesses from 
the standards point of view. However, it is possible that on some 
systems different threads could be running on different processor cores 
and so still be accessing x simultaneously even with interrupts disabled!

> Is there any reason that the compiler cannot delay writing "x" back so 
> that I get effectively this:
> 
> DI();
> cpu_register = x;
> if(cpu_register)
>   cpu_register--;
> EI();
> x = cpu_register;
> 
> ???

No reason at all, since as far as the C standard is concerned the only 
thing that could occur before the next access of x is a signal, and if 
it was acted on by a signal handler it would need to be "volatile 
sig_atomic_t" for the behaviour to be defined.

> It isn't clear to me if "volatile" is required on "x" or if there is any 
> possibility of the write of the variable back to memory being delayed.

 From the standards point of view it needs to be at least volatile and 
it would be best in my opinion for it to be at least "volatile sig_atomic_t"
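
A minimal sketch of the one case the standard does define, a signal handler setting a "volatile sig_atomic_t" flag. The names got_signal, on_signal and demo_signal_flag are made up for illustration, and SIGINT is raised synchronously with raise() only so the sketch is self-contained.

```c
#include <signal.h>

/* The object type the standard guarantees a handler can write safely. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install the handler, raise the signal, and report the flag.
   Returns -1 if the handler could not be installed or raised. */
int demo_signal_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return (int)got_signal;
}
```

This says nothing about hardware interrupts on a particular target; it only shows the pattern the standard itself covers.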

Since you are talking about threads you should look at what facilities 
the threading implementation you are using provides and what guarantees 
it provides.
-- 
Flash Gordon

Reply by Kaz Kylheku ● February 24, 2009

On 2009-02-24, Jujitsu Lizard <jujitsu.lizard@gmail.com> wrote:
> I've included a function below and the generated STM8 assembly-language.  As 
> it ends up (based on the assembly-language), the function is interrupt safe 
> as intended.
>
> My question is, let's assume I have this:
>
> DI();
> if (x)
>    x--;
> EI();
>
> where DI and EI just expand to the compiler's asm( ) feature to insert the 
> right machine instruction to disable and enable interrupts, ...

I.e. if we delimit a region of code with some compiler-specific
inline assembly magic, are there any requirements that we actually get
a properly implemented critical region?

The answer is that the standard C language doesn't have any requirements in
this area. The asm feature is an extension of your compiler (and a
non-conforming one, if it is actually called asm rather than say __asm,
since a C program can use the identifier asm).

Any requirements related to asm, such as the interaction between asm blocks and
surrounding code, can only be found in your compiler's documentation.

For instance, if you are using GNU C, there are special things you must do in
and around your __asm__ constructs to ensure that code is not improperly
reordered.  The GNU compiler allows inline assembly to be quite tightly
integrated into the generated code, and can even allocate registers for you.
I.e. in the inline assembly you can refer to virtual register names, rather
than concrete ones, and associate them with operands denoted by C syntax. The
compiler will find registers for those operands, and generate the loads and
stores to dovetail them into the surrounding code.

If you want to use GNU C inline assembly for things like critical regions,
where there are interactions with other threads or interrupts that are not
obvious to the compiler, you have to inform it that there are ordering and
memory issues.
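
A minimal sketch of what "informing the compiler" can look like in GNU C, using the "memory" clobber. It assumes a GCC-compatible compiler. The asm strings are left empty here so the sketch builds on any host; on a real target they would hold the interrupt-disable and interrupt-enable instructions. The variable shared_x and the function name are made up for illustration.

```c
#include <stdint.h>

/* GNU C only. The "memory" clobber tells the compiler that the asm may
   read or write memory, so it must not keep memory values cached in
   registers across the statement or move memory accesses past it.
   The empty string is a stand-in for the real instruction. */
#define DI() __asm__ __volatile__("" ::: "memory")
#define EI() __asm__ __volatile__("" ::: "memory")

uint8_t shared_x;   /* hypothetical variable shared with an ISR */

/* Decrement shared_x, but not below zero, inside the critical region. */
void dec_shared_x_nbz(void)
{
    DI();
    if (shared_x)
        shared_x--;
    EI();   /* the store to shared_x has to be emitted before this point */
}
```

This is a compiler-level barrier only. Whether anything more is needed depends on the target, and other compilers have their own mechanisms, so the generated code is still worth checking.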

> Is there any reason that the compiler cannot delay writing "x" back so that 
> I get effectively this:
>
> DI();
> cpu_register = x;
> if(cpu_register)
>    cpu_register--;
> EI();
> x = cpu_register;
>
> ???
>
> It isn't clear to me if "volatile" is required on "x" or if there is any 
> possibility of the write of the variable back to memory being delayed.

The volatile keyword is not particularly useful for concurrency issues. It is
defined by ISO C and has a couple of uses in the standard language, in relation
to signal handlers and setjmp/longjmp.

Whether it's suitable for any other purpose is up to the implementations.  So
volatile may either be too weak to prevent the reordering that you are worried
about, or, on the other extreme, it may be a blunt instrument---i.e. it may
defeat all optimization of the object that is declared volatile!

So if you use volatile, you might actually get the code

  cpu_register_0 = x

  if (cpu_register_0 != 0) {
    cpu_register_1 = x
    cpu_register_1--
    x = cpu_register_1
  }

I.e. since your code accesses x twice and stores once, and x is volatile,
the generated code may also access twice.

But when you have critical regions of code accessing shared data, you don't
want such unoptimized access to data. You only want to stop optimization across
the entry and exit to the critical region, not everywhere! You want correct
concurrency, but not at the cost of poor code.

This is why some of the popular multithreading interfaces, like POSIX threads
and Win32, do not require volatile qualification on shared data.  They
stipulate that acquiring and releasing a proper synchronization object is
enough. e.g. if you call pthread_mutex_lock, then writes performed before the
call are settled, and no premature reads have taken place.

Ideally, you want your DI and EI macros to behave the same way. The DI macro
should provide the assurance that accesses to objects prior to DI have
completed, and accesses which happen after have not yet begun.

Your compiler's documentation must explain how to do this. If it does not, you
can try your luck in various ways, like investigating the compiler's actual
behavior.

It may be enough to put DI and EI into external functions (functions defined in
a different translation unit from everything that calls them).

Since x is a shared variable, then the compiler must suspect that an external
function call may modify x --- unless it is devilishly clever and can prove
otherwise. That is to say, the assignment to x cannot be delayed until after
EI(), because EI is an external function which can interact with x.
Even if x is a block-scope static variable, EI could conceivably recurse back
into this function:

  static void local_fun()
  {
     extern void DI(void);
     extern void EI(void);

     static int shared_x;

     /* ... */

     DI();
     shared_x++;
     EI();

     /* ... */
  }

Without knowing anything about EI and DI, we can't prove that they don't
recurse into local_fun somehow, in which case shared_x must have the old
value upon the call to DI, and the new value before the call to EI.

If you put the interrupt manipulation into external functions, there is a good
likelihood that it will work. Of course, you have to review the generated code,
and that would be a last resort, if you cannot coax the behavior out of the
inlined versions.

Reply by Keith Thompson ● February 24, 2009

Richard Heathfield <rjh@see.sig.invalid> writes:
> Jujitsu Lizard said:
>
> <snip>
>  
>> My question is, let's assume I have this:
>> 
>> DI();
>> if (x)
>>    x--;
>> EI();
>> 
>> where DI and EI just expand to the compiler's asm( ) feature to
>> insert the right machine instruction to disable and enable
>> interrupts, ...
>> 
>> Is there any reason that the compiler cannot delay writing "x"
>> back so that I get effectively this:
>> 
>> DI();
>> cpu_register = x;
>> if(cpu_register)
>>    cpu_register--;
>> EI();
>> x = cpu_register;
>> 
>> ???
>
> 3.6: "A full expression is an expression that is not part of another 
> expression.  Each of the following is a full expression: an 
> initializer; the expression in an expression statement; the 
> controlling expression of a selection statement ( if or switch ); 
> the controlling expression of a while or do statement; each of the 
> three expressions of a for statement; the expression in a return 
> statement. The end of a full expression is a sequence point."
>
> Having said that, the "as if" rule applies. If the implementation 
> "wants" to delay the assignment to x, it is permitted to do so 
> /provided/ that a strictly conforming program can't tell the 
> difference.

And of course a program that uses DI() and EI(), assuming they're as
Jujitsu Lizard describes them, cannot be strictly conforming.

But the as-if rule doesn't refer just to strictly conforming programs.
Though the standard itself doesn't use the term, the index entry for
"as-if rule" points to 5.1.2.3, where paragraph 3 says:

    In the abstract machine, all expressions are evaluated as
    specified by the semantics. An actual implementation need not
    evaluate part of an expression if it can deduce that its value is
    not used and that no needed side effects are produced (including
    any caused by calling a function or accessing a volatile object).

And paragraph 4 may also be relevant here:

    When the processing of the abstract machine is interrupted by
    receipt of a signal, only the values of objects as of the previous
    sequence point may be relied on. Objects that may be modified
    between the previous sequence point and the next sequence point
    need not have received their correct values yet.

But I think that a program that uses DI() and EI() isn't just
not-strictly-conforming; I think it's all the way into the realm of
undefined behavior (which merely means behavior that isn't defined by
the standard).  I think your question can be answered only by the
documentation for your implementation.

-- 
Keith Thompson (The_Other_Keith) kst@mib.org  <http://www.ghoti.net/~kst>
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"

Reply by Richard Heathfield ● February 24, 2009

Jujitsu Lizard said:

> "Richard Heathfield" <rjh@see.sig.invalid> wrote in message
> news:-q6dnZpxK-fwETnUnZ2dnUVZ8j-WnZ2d@bt.com...
>>>
>>> Is there any reason that the compiler cannot delay writing "x"
>>> back so that I get effectively this:
>>>
>>> DI();
>>> cpu_register = x;
>>> if(cpu_register)
>>>    cpu_register--;
>>> EI();
>>> x = cpu_register;
>>>
>>> ???
>>
>> 3.6: "A full expression is an expression that is not part of
>> another
>> expression.  Each of the following is a full expression: an
>> initializer; the expression in an expression statement; the
>> controlling expression of a selection statement ( if or switch );
>> the controlling expression of a while or do statement; each of
>> the three expressions of a for statement; the expression in a
>> return statement. The end of a full expression is a sequence
>> point."
>>
>> Having said that, the "as if" rule applies. If the implementation
>> "wants" to delay the assignment to x, it is permitted to do so
>> /provided/ that a strictly conforming program can't tell the
>> difference.
>>
>> <snip>
> 
> "tell the difference" is a bit of an ambiguous phrase.

Okay, I'll amplify it a bit. Imagine one strictly conforming 
program, and two otherwise identical compilers, one of which delays 
the assignment to x and one of which does not. Compile it with each 
compiler and run it with identical input. If the two invocations of 
the program (one under each compiler) produce identical output, 
then the strictly conforming program failed to tell the difference, 
so the delay is okay. But if they don't, then the delay is not 
okay.

> I think you are saying that in this case the compiler is not free
> to delay the write of "x" because that would be a logical error --

It's allowed to delay the x write /provided/ that the same results 
are produced as in an otherwise identical compiler that does not 
delay the x write.
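
A minimal sketch of that comparison, with made-up function names. One function performs three separate updates as the source would spell them, the other does what a compiler might emit after folding them. Within standard C nothing can observe the intermediate values, so the two forms are interchangeable.

```c
#include <stdint.h>

/* What the source says: three separate updates of the object. */
uint8_t add_stepwise(uint8_t x)
{
    x += 3;
    x += 5;
    x += 7;
    return x;
}

/* What a compiler may emit instead: one folded update. */
uint8_t add_folded(uint8_t x)
{
    return (uint8_t)(x + 15);
}

/* Returns 1 if the two forms agree for every 8-bit input, else 0. */
int forms_agree(void)
{
    unsigned v;
    for (v = 0; v <= 255u; v++)
        if (add_stepwise((uint8_t)v) != add_folded((uint8_t)v))
            return 0;
    return 1;
}
```

An interrupt handler that reads the object between the updates is exactly the kind of observer the standard does not account for, which is why the answer has to come from the implementation.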

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Reply by Gordon Burditt ● February 24, 2009

>I've included a function below and the generated STM8 assembly-language.  As 
>it ends up (based on the assembly-language), the function is interrupt safe 
>as intended.

My interpretation:  since you used asm(), all bets are off from the
point of view of the C standard.  Among other things, the compiler
doesn't know that the EI or DI instructions don't clobber registers
that it's using.  Even with some assembly languages and machine
code, instructions are not necessarily executed in the order they
are in the code.  I suspect that your processor does not schedule
instructions like this.

If you have a good compiler, it might assume that instructions
unknown to it introduced with asm() potentially clobber almost
everything, so it has to push most of the registers and restore
them afterward.  This may kill your speed but guarantee ordering.

gcc's version of asm() allows you to indicate what registers are
potentially clobbered and which aren't.  Done carefully, this lets
you minimize unnecessary register-saving.

If you use threads, there's no guarantee that sequence points will
work *between* threads.  I'd expect that sequence points will work OK
between some stuff and other stuff in the same thread.

>
>My question is, let's assume I have this:
>
>DI();
>if (x)
>   x--;
>EI();
>
>where DI and EI just expand to the compiler's asm( ) feature to insert the 
>right machine instruction to disable and enable interrupts, ...
>
>Is there any reason that the compiler cannot delay writing "x" back so that 
>I get effectively this:
>
>DI();
>cpu_register = x;
>if(cpu_register)
>   cpu_register--;
>EI();
>x = cpu_register;
>
>???

No.  And there's no reason at the end of this that it won't hold on to
cpu_register as containing a valid copy of x for use later on even
if storing the value back to x comes before EI().

>It isn't clear to me if "volatile" is required on "x" 

It wouldn't hurt, except perhaps in performance, and it might help in
the situation above where it remembers that cpu_register contains a valid
copy of x when it might not be - volatile in that situation might
eliminate the problem.

>or if there is any 
>possibility of the write of the variable back to memory being delayed.

You really can't look to the C standard for this.  You have to rely on
guarantees made by the compiler, which usually aren't much.

Reply by Jujitsu Lizard ● February 24, 2009

Addendum to my previous post ...

I did a little more dinking with the compiler, and I'm convinced now that it 
can't be fully trusted.

Here is the code (just dinking around):

UINT8 in_nelem22;

void MF_decu8arr_nbz(UINT8 *in_arg, UINT16 in_nelem)
   {
   while (in_nelem)
      {
      DI();
      in_nelem22 += 3;
      EI();
      in_nelem22 += 5;
      in_nelem22 += 7;
      in_nelem++;
      }
   }

and here is what it got me:

 701                     ; 291 void MF_decu8arr_nbz(UINT8 *in_arg, UINT16 in_nelem)
 701                     ; 292    {
 702                      switch .text
 703  00b4               f_MF_decu8arr_nbz:
 705  00b4 89             pushw x
 706       00000000      OFST: set 0
 709  00b5 200f           jra L552
 710  00b7               L352:
 711                     ; 295       DI();
 714  00b7 9b             sim
 716                     ; 296       in_nelem22 += 3;
 718  00b8 c60000         ld a,_in_nelem22
 719  00bb ab03           add a,#3
 720                     ; 297       EI();
 723  00bd 9a             rim
 725                     ; 298       in_nelem22 += 5;
 727  00be ab0c           add a,#12
 728                     ; 299       in_nelem22 += 7;
 730  00c0 c70000         ld _in_nelem22,a
 731                     ; 300       in_nelem++;
 733  00c3 5c             incw x
 734  00c4 1f06           ldw (OFST+6,sp),x
 735  00c6               L552:
 736                     ; 293    while (in_nelem)
 738  00c6 1e06           ldw x,(OFST+6,sp)
 739  00c8 26ed           jrne L352
 740                     ; 302    }
 743  00ca 85             popw x
 744  00cb 87             retf

Note that the EI() caused it to split up the additions into +3 and +12, but 
that IT DID NOT WRITE THE VARIABLE BACK TO MEMORY UNTIL LATER (it kept the 
contents in the accumulator).

This is a danger sign.

It means in a complex code sequence with DI() and EI(), it might get me!
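
A rough host-side model of why this matters, not target code. It assumes an interrupt is pending and gets serviced right after rim, and it simulates that ISR with an ordinary function call. All names (mem_x, simulated_isr, run_as_compiled, run_as_intended) are made up for illustration.

```c
#include <stdint.h>

static uint8_t mem_x;     /* stands in for in_nelem22 in memory      */
static uint8_t isr_saw;   /* value the simulated ISR read from mem_x */

/* Simulated ISR: runs at the point where interrupts are re-enabled. */
static void simulated_isr(void)
{
    isr_saw = mem_x;
}

/* Models the listing: the sum stays in the accumulator across rim. */
uint8_t run_as_compiled(uint8_t start)
{
    uint8_t a;
    mem_x = start;
    /* sim */
    a = (uint8_t)(mem_x + 3);   /* ld a,_in_nelem22 ; add a,#3 */
    /* rim */
    simulated_isr();            /* pending interrupt serviced here */
    a = (uint8_t)(a + 12);      /* add a,#12 */
    mem_x = a;                  /* ld _in_nelem22,a */
    return isr_saw;
}

/* Models the intent: the +3 result reaches memory before rim. */
uint8_t run_as_intended(uint8_t start)
{
    mem_x = start;
    /* sim */
    mem_x = (uint8_t)(mem_x + 3);
    /* rim */
    simulated_isr();
    mem_x = (uint8_t)(mem_x + 12);
    return isr_saw;
}
```

Both versions leave the same final value in memory, but starting from 10 the ISR sees 10 in the first model and 13 in the second. That gap is the hazard.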

Thanks for all the information and advice.

The Lizard

Reply by Kaz Kylheku ● February 24, 2009

On 2009-02-25, Jujitsu Lizard <jujitsu.lizard@gmail.com> wrote:
> Addendum to my previous post ...
>
> I did a little more dinking with the compiler, and I'm convinced now that it 
> can't be fully trusted.

What if you make DI and EI into external functions, in a separate compilation
unit? That could just be the silver bullet for this compiler; no need to
snapshot the assembly language.

The external calls are a cost (and add to the amount of time you spend with
interrupts disabled!) but maybe it's a worthwhile tradeoff, if it works.

Also, in spite of what I wrote about volatile in the other article, it may also
work under this compiler, and you can minimize the performance-hurting aspects
of volatile by using non-volatile temporaries.

I.e. suppose we want this:

  DI();
  if (condition(x))
     x++;
  EI();

If x is volatile int, the semantics is that there are two accesses to x and one
store. But suppose you have a local variable temp of the same type as x, but
not volatile:

  DI();
  {
    int temp = x;
    if (condition(temp))
      x = temp + 1;
  }
  EI();

Now we are back to one access and one store, the minimum required. We are
hoping that the compiler can optimize away temp entirely.
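
The same pattern applied to the decrement-not-below-zero case from the original post, as a minimal sketch. DI() and EI() are no-op placeholders so it builds on a host compiler, and timer_x and dec_timer_nbz are made-up names.

```c
#include <stdint.h>

/* Placeholders only. On the real target these would expand to the
   interrupt-disable and interrupt-enable instructions. */
#define DI() ((void)0)
#define EI() ((void)0)

volatile uint8_t timer_x;   /* hypothetical software timer shared with an ISR */

/* Decrement timer_x, but not below zero, with one read and at most one write. */
void dec_timer_nbz(void)
{
    DI();
    {
        uint8_t temp = timer_x;              /* the single volatile read  */
        if (temp)
            timer_x = (uint8_t)(temp - 1);   /* the single volatile write */
    }
    EI();
}
```

Whether the volatile accesses actually stay between the two instructions is still up to the compiler, so the generated code needs the same review as before.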

It would be irksome to rewrite all critical region code this way, though.

> Note that the EI() caused it to split up the additions into +3 and +12, but 
> that IT DID NOT WRITE THE VARIABLE BACK TO MEMORY UNTIL LATER (it kept the 
> contents in the accumulator).

It's obvious that no memory writes at all take place in the critical region
between the ``sim'' and ``rim''.
