EmbeddedRelated.com
Forums
Memfault Beyond the Launch

Help with interrupt software routine and non-atomic operations

Started by pozz April 28, 2016
I have an ISR that uses a struct with two members:

struct mystruct {
   const void *data;
   size_t size;
} *curr, *next, curr_s, next_s;

void ISR(void) {
   static const unsigned char *d;
   if (next) {
     *curr = *next;
     next = NULL;
     d = curr->data;
   }
   if (curr) {
     /* Process curr structure, byte by byte */
     ...
     if (d - (const unsigned char *)curr->data ==
             curr->size) {
       curr = NULL;
     }
   }
}

Now, to activate a new data structure I wrote the following routine:

void foo(const void *data, size_t size) {
   if ((volatile struct mystruct *)curr == NULL) {
     curr_s.data = data;
     curr_s.size = size;
     curr = &curr_s;
   } else {
(*) next_s.data = data;
     next_s.size = size;
     next = &next_s;
   }
}

If a current data structure is present, it should be stopped and the new 
one should take place.

The first if guarantees that there isn't a current structure, so curr 
and curr_s can be changed without any problem.

The second if happens when a current structure really exists. In this 
case, I can't change curr_s and curr directly, so I use a temporary data 
structure next_s and an atomic flag next (I'm working on 32-bits 
architecture, so the assignment is atomic). The next data will take 
place in the next ISR.

Now suppose current structure isn't present (curr == NULL) and I call 
foo() three times:

   foo(d1, s1);
   foo(d2, s2);
   foo(d3, s3);

I would like to have finally (d3,s3). After first foo(), the first if is 
followed and curr_s data structure and curr flag is filled.
After second foo(), the second if is followed and next_s/next are filled.

What happens with the third foo()? What happens if the ISR is fired 
exactly *after* line marked with (*) above?
The ISR will use next_s data structure, but it contains uncoherent data 
values: data points to d3, but size is s2.

How to avoid this problem?

Just to explain my problem. The ISR is a timer interrupt that generates 
and audio stream at a sampling frequency on a DAC output. foo() can be 
named audio_play().
On Thu, 28 Apr 2016 15:15:38 +0200, pozz wrote:

> I have an ISR that uses a struct with two members: > > struct mystruct { > const void *data; > size_t size; > } *curr, *next, curr_s, next_s; > > void ISR(void) { > static const unsigned char *d; > if (next) { > *curr = *next; > next = NULL; > d = curr->data; > } > if (curr) { > /* Process curr structure, byte by byte */ ... > if (d - (const unsigned char *)curr->data == > curr->size) { > curr = NULL; > } > } > } > > Now, to activate a new data structure I wrote the following routine: > > void foo(const void *data, size_t size) { > if ((volatile struct mystruct *)curr == NULL) { > curr_s.data = data; > curr_s.size = size; > curr = &curr_s; > } else { > (*) next_s.data = data; > next_s.size = size; > next = &next_s; > } > } > > If a current data structure is present, it should be stopped and the new > one should take place. > > The first if guarantees that there isn't a current structure, so curr > and curr_s can be changed without any problem. > > The second if happens when a current structure really exists. In this > case, I can't change curr_s and curr directly, so I use a temporary data > structure next_s and an atomic flag next (I'm working on 32-bits > architecture, so the assignment is atomic). The next data will take > place in the next ISR. > > Now suppose current structure isn't present (curr == NULL) and I call > foo() three times: > > foo(d1, s1); > foo(d2, s2); > foo(d3, s3); > > I would like to have finally (d3,s3). After first foo(), the first if is > followed and curr_s data structure and curr flag is filled. > After second foo(), the second if is followed and next_s/next are > filled. > > What happens with the third foo()? What happens if the ISR is fired > exactly *after* line marked with (*) above? > The ISR will use next_s data structure, but it contains uncoherent data > values: data points to d3, but size is s2. > > How to avoid this problem? > > Just to explain my problem. The ISR is a timer interrupt that generates > and audio stream at a sampling frequency on a DAC output. foo() can be > named audio_play().
Normally you identify those bits of code that need to be atomic but aren't, you turn off interrupts for JUST THAT PART, and then you turn interrupts back on. You have to accept that your interrupt latency will increase a bit (and you want keep a stick handy, to whack any developer who wraps a protection block around more than the absolute minimum of code). -- www.wescottdesign.com
On 28/04/16 15:15, pozz wrote:
> I have an ISR that uses a struct with two members: > > struct mystruct { > const void *data; > size_t size; > } *curr, *next, curr_s, next_s; > > void ISR(void) { > static const unsigned char *d; > if (next) { > *curr = *next; > next = NULL; > d = curr->data; > } > if (curr) { > /* Process curr structure, byte by byte */ > ... > if (d - (const unsigned char *)curr->data == > curr->size) { > curr = NULL; > } > } > } > > Now, to activate a new data structure I wrote the following routine: > > void foo(const void *data, size_t size) { > if ((volatile struct mystruct *)curr == NULL) { > curr_s.data = data; > curr_s.size = size; > curr = &curr_s; > } else { > (*) next_s.data = data; > next_s.size = size; > next = &next_s; > } > } > > If a current data structure is present, it should be stopped and the new > one should take place. > > The first if guarantees that there isn't a current structure, so curr > and curr_s can be changed without any problem.
Your code above does not guarantee that curr_s is set appropriately before curr is set. The compiler is free to store the address of curr_s in curr before setting curr_s.data and curr_s.size. To force the desired ordering, you have two choices. You can make sure curr_s /and/ curr are both accessed as volatile (by using casts to volatile, or by simply declaring the variables as volatile). Or you can use a memory barrier to stop the movement of accesses back and forth: asm volatile("" ::: "memory"); (assuming gcc or a compiler that supports gcc inline assembly). Unless you are sure you know what you are doing with memory barriers, and with getting volatile casts in exactly the right places, it is usually easier to declare the relevant variables as volatile, at the risk of slightly suboptimal code generation. And that means /all/ the relevant variables - volatile ordering does not affect the ordering of non-volatile accesses.
> > The second if happens when a current structure really exists. In this > case, I can't change curr_s and curr directly, so I use a temporary data > structure next_s and an atomic flag next (I'm working on 32-bits > architecture, so the assignment is atomic). The next data will take > place in the next ISR. >
Again, you need volatiles or a memory barrier to force the setting of next_s before the setting of next.
> Now suppose current structure isn't present (curr == NULL) and I call > foo() three times: > > foo(d1, s1); > foo(d2, s2); > foo(d3, s3); > > I would like to have finally (d3,s3). After first foo(), the first if is > followed and curr_s data structure and curr flag is filled. > After second foo(), the second if is followed and next_s/next are filled. > > What happens with the third foo()?
If the ISR has not fired, then next will be filled with d3, s3.
> What happens if the ISR is fired > exactly *after* line marked with (*) above?
Trouble.
> The ISR will use next_s data structure, but it contains uncoherent data > values: data points to d3, but size is s2. > > How to avoid this problem? > > Just to explain my problem. The ISR is a timer interrupt that generates > and audio stream at a sampling frequency on a DAC output. foo() can be > named audio_play().
There are two ways to handle this that I can think of. One method is to disable interrupts (or at least the timer interrupt) around the critical section. As long as you remember to use volatile accesses or memory barriers appropriately, that should be easy to get right (for convenience, your interrupt enable/disable macros should include memory barriers). Other than that, you basically have a two entry FIFO queue with one reader client (the ISR) and one writer client (foo). This can be implemented as a 3 entry ring buffer with separate head and tail pointers. The writer only ever writes to the tail pointer and the buffer, the reader only ever writes to the head pointer. As each part can only be written by one task, you can keep them consistent. However, a simple implementation will only work if you drop new values when your FIFO is full - it gets messy if you want to be able to overwrite the final entry with new data.
On 28/04/16 22:14, Tim Wescott wrote:

> > Normally you identify those bits of code that need to be atomic but > aren't, you turn off interrupts for JUST THAT PART, and then you turn > interrupts back on. > > You have to accept that your interrupt latency will increase a bit (and > you want keep a stick handy, to whack any developer who wraps a > protection block around more than the absolute minimum of code). >
Turning off interrupts for a bit is often the simplest way, but just be /very/ careful that you really do it correctly. Something like: curr_s.data = data; curr_s.size = size; asm volatile("cpsid i"); curr = &curr_s; asm volatile("cpsie i"); is /not/ good enough. The compiler can freely move the writes to curr_s inside the critical zone (leading to unpleasant whacking), or it can move the write to curr outside the critical zone (leading to everything working perfectly until the customer demonstration, when it fails). It can even move the writes to curr_s after the write to curr, and after the critical zone - who knows what will happen then. Make sure you have memory barriers, such as these (for ARM Cortex-M devices): static inline void disableGlobalInterrupts(void) { asm volatile("cpsid i" ::: "memory"); } static inline void enableGlobalInterrupts(void) { asm volatile("cpsie i" ::: "memory"); }
On Thu, 28 Apr 2016 23:42:39 +0200, David Brown wrote:

> On 28/04/16 22:14, Tim Wescott wrote: > > >> Normally you identify those bits of code that need to be atomic but >> aren't, you turn off interrupts for JUST THAT PART, and then you turn >> interrupts back on. >> >> You have to accept that your interrupt latency will increase a bit (and >> you want keep a stick handy, to whack any developer who wraps a >> protection block around more than the absolute minimum of code). >> >> > Turning off interrupts for a bit is often the simplest way, but just be > /very/ careful that you really do it correctly. Something like: > > > curr_s.data = data; > curr_s.size = size; > asm volatile("cpsid i"); > curr = &curr_s; > asm volatile("cpsie i"); > > > is /not/ good enough. The compiler can freely move the writes to curr_s > inside the critical zone (leading to unpleasant whacking), or it can > move the write to curr outside the critical zone (leading to everything > working perfectly until the customer demonstration, when it fails). It > can even move the writes to curr_s after the write to curr, and after > the critical zone - who knows what will happen then. > > > Make sure you have memory barriers, such as these (for ARM Cortex-M > devices): > > > static inline void disableGlobalInterrupts(void) { > asm volatile("cpsid i" ::: "memory"); > } > > static inline void enableGlobalInterrupts(void) { > asm volatile("cpsie i" ::: "memory"); > }
Something just went on my "to be fixed" queue. I assume that I've been getting away with it by luck. At any rate, to provide the full-meal-deal you also save the current state of interrupts, so that you don't turn the turned-off interrupts off, and then turn them on inappropriately. inline int disable_interrupts(void) { int primask_copy; asm("mrs %[primask_copy], primask\n\t" // save interrupt status "cpsid i\n\t" // disable interrupts : [primask_copy] "=r" (primask_copy) :: "memory"); return primask_copy; } inline void enable_interrupts(int primask) { int primask_copy = primask; // Restore interrupts to their previous value asm("msr primask, %[primask_copy]" : : [primask_copy] "r" (primask_copy) : "memory"); } Then in the code you write: int interrupt_state = disable_interrupts(); // Put your teeny bit of critical stuff here enable_interrupts(interrupt_state); I _hope_ that this works right for all cases -- I'm interested in what David has to say about it. (It's not guaranteed to compile -- I just added the "memory" stuff; I need to go put it into my code) -- Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com
Il 28/04/2016 22:26, David Brown ha scritto:
> On 28/04/16 15:15, pozz wrote: >> I have an ISR that uses a struct with two members: >> >> struct mystruct { >> const void *data; >> size_t size; >> } *curr, *next, curr_s, next_s; >> >> void ISR(void) { >> static const unsigned char *d; >> if (next) { >> *curr = *next; >> next = NULL; >> d = curr->data; >> } >> if (curr) { >> /* Process curr structure, byte by byte */ >> ... >> if (d - (const unsigned char *)curr->data == >> curr->size) { >> curr = NULL; >> } >> } >> } >> >> Now, to activate a new data structure I wrote the following routine: >> >> void foo(const void *data, size_t size) { >> if ((volatile struct mystruct *)curr == NULL) { >> curr_s.data = data; >> curr_s.size = size; >> curr = &curr_s; >> } else { >> (*) next_s.data = data; >> next_s.size = size; >> next = &next_s; >> } >> } >> >> If a current data structure is present, it should be stopped and the new >> one should take place. >> >> The first if guarantees that there isn't a current structure, so curr >> and curr_s can be changed without any problem. > > Your code above does not guarantee that curr_s is set appropriately > before curr is set. The compiler is free to store the address of curr_s > in curr before setting curr_s.data and curr_s.size.
You are right, I didn't thought about this situation.
> To force the desired ordering, you have two choices. You can make sure > curr_s /and/ curr are both accessed as volatile (by using casts to > volatile, or by simply declaring the variables as volatile). Or you can > use a memory barrier to stop the movement of accesses back and forth: > > asm volatile("" ::: "memory"); > > (assuming gcc or a compiler that supports gcc inline assembly). > > Unless you are sure you know what you are doing with memory barriers, > and with getting volatile casts in exactly the right places, it is > usually easier to declare the relevant variables as volatile, at the > risk of slightly suboptimal code generation. And that means /all/ the > relevant variables - volatile ordering does not affect the ordering of > non-volatile accesses.
Does the following code fix that problem? void foo(const void *data, size_t size) { volatile struct mystruct *c = (volatile struct mystruct *)curr; if (c == NULL) { curr_s.data = data; curr_s.size = size; c = &curr_s; } else { /* ... */ } } Now is the instruction "c = &curr_s;" *guaranteed* to be executed *after* curr_s assignements?
>> The second if happens when a current structure really exists. In this >> case, I can't change curr_s and curr directly, so I use a temporary data >> structure next_s and an atomic flag next (I'm working on 32-bits >> architecture, so the assignment is atomic). The next data will take >> place in the next ISR. >> > > Again, you need volatiles or a memory barrier to force the setting of > next_s before the setting of next.
Yes.
>> Now suppose current structure isn't present (curr == NULL) and I call >> foo() three times: >> >> foo(d1, s1); >> foo(d2, s2); >> foo(d3, s3); >> >> I would like to have finally (d3,s3). After first foo(), the first if is >> followed and curr_s data structure and curr flag is filled. >> After second foo(), the second if is followed and next_s/next are filled. >> >> What happens with the third foo()? > > If the ISR has not fired, then next will be filled with d3, s3. > >> What happens if the ISR is fired >> exactly *after* line marked with (*) above? > > Trouble. > >> The ISR will use next_s data structure, but it contains uncoherent data >> values: data points to d3, but size is s2. >> >> How to avoid this problem? >> >> Just to explain my problem. The ISR is a timer interrupt that generates >> and audio stream at a sampling frequency on a DAC output. foo() can be >> named audio_play(). > > There are two ways to handle this that I can think of. One method is to > disable interrupts (or at least the timer interrupt) around the critical > section. As long as you remember to use volatile accesses or memory > barriers appropriately, that should be easy to get right (for > convenience, your interrupt enable/disable macros should include memory > barriers).
Yes, disabling interrupts is the simplest solution, but I usually try to find a more elegant solution... if any exists.
> Other than that, you basically have a two entry FIFO queue with one > reader client (the ISR) and one writer client (foo). This can be > implemented as a 3 entry ring buffer with separate head and tail > pointers. The writer only ever writes to the tail pointer and the > buffer, the reader only ever writes to the head pointer. As each part > can only be written by one task, you can keep them consistent. However, > a simple implementation will only work if you drop new values when your > FIFO is full - it gets messy if you want to be able to overwrite the > final entry with new data.
But I need to keep the last pushed data, not dropping it. What about TWO next data structures? When a foo() is called and current structure is busy, the free next data structure is filled and the next pointer is then assigned to the new data structure. struct mystruct { const void *data; size_t size; } *curr, *next, curr_s, next1_s, next2_s; void foo(const void *data, size_t size) { volatile struct mystruct *c = (volatile struct mystruct *)curr; volatile struct mystruct *n = (volatile struct mystruct *)next; if (c == NULL) { curr_s.data = data; curr_s.size = size; c = &curr_s; } else if ((n == NULL) || (n == &next2_s)) { // Use next1 next1_s.data = data; next1_s.size = size; n = &next1_s; } else { // Use next2 next2_s.data = data; next2_s.size = size; n = &next2_s; } } If volatile cast guarantees that the order of the three instructions inside if branches isn't changed, I think this solution works in any case.
On 29/04/16 06:00, Tim Wescott wrote:
> On Thu, 28 Apr 2016 23:42:39 +0200, David Brown wrote: > >> On 28/04/16 22:14, Tim Wescott wrote: >> >> >>> Normally you identify those bits of code that need to be atomic but >>> aren't, you turn off interrupts for JUST THAT PART, and then you turn >>> interrupts back on. >>> >>> You have to accept that your interrupt latency will increase a bit (and >>> you want keep a stick handy, to whack any developer who wraps a >>> protection block around more than the absolute minimum of code). >>> >>> >> Turning off interrupts for a bit is often the simplest way, but just be >> /very/ careful that you really do it correctly. Something like: >> >> >> curr_s.data = data; >> curr_s.size = size; >> asm volatile("cpsid i"); >> curr = &curr_s; >> asm volatile("cpsie i"); >> >> >> is /not/ good enough. The compiler can freely move the writes to curr_s >> inside the critical zone (leading to unpleasant whacking), or it can >> move the write to curr outside the critical zone (leading to everything >> working perfectly until the customer demonstration, when it fails). It >> can even move the writes to curr_s after the write to curr, and after >> the critical zone - who knows what will happen then. >> >> >> Make sure you have memory barriers, such as these (for ARM Cortex-M >> devices): >> >> >> static inline void disableGlobalInterrupts(void) { >> asm volatile("cpsid i" ::: "memory"); >> } >> >> static inline void enableGlobalInterrupts(void) { >> asm volatile("cpsie i" ::: "memory"); >> } > > Something just went on my "to be fixed" queue. I assume that I've been > getting away with it by luck.
A great many people misunderstand about how memory accesses can be re-ordered in C. They are saved by several things: 1. If you use "volatile" generously, you will probably get the ordering you want. All volatile accesses (either because the variable is declared "volatile", or you use volatile casts for the access) are ordered with respect to each other. It is only non-volatile accesses that can change order. 2. Compilers don't usually move reads and writes around unless there is something to gain. If you have a fast superscaler processor with caches, it can happen a lot - in particular, compilers will often generate reads quite early so that the data is there in the register when it is actually needed. But for simpler, in-order processors, there is often no point in reading early or writing late, although the compiler may not bother writing out the data if it knows it will be writing new values to the variable shortly, and it has the free registers available. 3. Many developers don't really understand compiler settings and optimisations, and simply compile with optimisations disabled (how many times have you heard people say their code runs "correctly" when optimisations are disabled, but not when optimised?). Of course that means bigger and slower code - and in this branch, that can sometimes mean more expensive and power-hungry microcontrollers. And it also means code that may break on newer compilers - compilers are still free to optimise as they want despite your choice of compiler flags. "-O" settings are hints, not demands. 4. When you get this sort of thing wrong, you probably won't notice. Everything will work unless an interrupt hits at exactly the right point in the code, and the chances of that happening are typically small.
> > At any rate, to provide the full-meal-deal you also save the current > state of interrupts, so that you don't turn the turned-off interrupts > off, and then turn them on inappropriately. > > inline int disable_interrupts(void) > { > int primask_copy; > asm("mrs %[primask_copy], primask\n\t" // save interrupt status > "cpsid i\n\t" // disable interrupts > : [primask_copy] "=r" (primask_copy) > :: "memory"); > > return primask_copy; > } > > inline void enable_interrupts(int primask) > { > int primask_copy = primask; > > // Restore interrupts to their previous value > asm("msr primask, %[primask_copy]" : : [primask_copy] "r" (primask_copy) > : "memory"); > } > > Then in the code you write: > > int interrupt_state = disable_interrupts(); > > // Put your teeny bit of critical stuff here > > enable_interrupts(interrupt_state); > > I _hope_ that this works right for all cases -- I'm interested in what > David has to say about it.
That all looks fine. However, generally you don't need to save interrupt state - because you /know/ whether you are in a critical section or not. If you have written: foo(void) { // Do lots of stuff disableInterrupts(); // tiny critical section stuff enableInterrupts(); // Do lots more stuff } then the only reason you would want to make your "disableInterrupts/enableInterrupts" store and restore the previous value is if "foo" is sometimes being called from within an interrupt function or another critical section. And if that is happening, your interrupts will be blocked during "lots of stuff" and "lots more stuff" - your system is likely broken. However, sometimes you have functions that you want to call from within interrupts, or other situations where nested critical sections could occur - and in that acse, you want to track and restore the interrupt status.
> > (It's not guaranteed to compile -- I just added the "memory" stuff; I > need to go put it into my code) >
On 29/04/16 09:02, pozz wrote:
> Il 28/04/2016 22:26, David Brown ha scritto: >> On 28/04/16 15:15, pozz wrote: >>> I have an ISR that uses a struct with two members: >>> >>> struct mystruct { >>> const void *data; >>> size_t size; >>> } *curr, *next, curr_s, next_s; >>> >>> void ISR(void) { >>> static const unsigned char *d; >>> if (next) { >>> *curr = *next; >>> next = NULL; >>> d = curr->data; >>> } >>> if (curr) { >>> /* Process curr structure, byte by byte */ >>> ... >>> if (d - (const unsigned char *)curr->data == >>> curr->size) { >>> curr = NULL; >>> } >>> } >>> } >>> >>> Now, to activate a new data structure I wrote the following routine: >>> >>> void foo(const void *data, size_t size) { >>> if ((volatile struct mystruct *)curr == NULL) { >>> curr_s.data = data; >>> curr_s.size = size; >>> curr = &curr_s; >>> } else { >>> (*) next_s.data = data; >>> next_s.size = size; >>> next = &next_s; >>> } >>> } >>> >>> If a current data structure is present, it should be stopped and the new >>> one should take place. >>> >>> The first if guarantees that there isn't a current structure, so curr >>> and curr_s can be changed without any problem. >> >> Your code above does not guarantee that curr_s is set appropriately >> before curr is set. The compiler is free to store the address of curr_s >> in curr before setting curr_s.data and curr_s.size. > > You are right, I didn't thought about this situation. > > >> To force the desired ordering, you have two choices. You can make sure >> curr_s /and/ curr are both accessed as volatile (by using casts to >> volatile, or by simply declaring the variables as volatile). Or you can >> use a memory barrier to stop the movement of accesses back and forth: >> >> asm volatile("" ::: "memory"); >> >> (assuming gcc or a compiler that supports gcc inline assembly). >> >> Unless you are sure you know what you are doing with memory barriers, >> and with getting volatile casts in exactly the right places, it is >> usually easier to declare the relevant variables as volatile, at the >> risk of slightly suboptimal code generation. And that means /all/ the >> relevant variables - volatile ordering does not affect the ordering of >> non-volatile accesses. > > Does the following code fix that problem? > > void foo(const void *data, size_t size) { > volatile struct mystruct *c = (volatile struct mystruct *)curr; > if (c == NULL) { > curr_s.data = data; > curr_s.size = size; > c = &curr_s; > } else { > /* ... */ > } > } > > Now is the instruction "c = &curr_s;" *guaranteed* to be executed > *after* curr_s assignements? >
No - volatile accesses are only ordered with respect to other volatile accesses (and other observable or potentially observable events, such as calls to external functions). The stores to curr_s can be moved after the store to c, since the stores to curr_s are not volatile. (You could also add a memory barrier before "c = &curr_s;".)
> >>> The second if happens when a current structure really exists. In this >>> case, I can't change curr_s and curr directly, so I use a temporary data >>> structure next_s and an atomic flag next (I'm working on 32-bits >>> architecture, so the assignment is atomic). The next data will take >>> place in the next ISR. >>> >> >> Again, you need volatiles or a memory barrier to force the setting of >> next_s before the setting of next. > > Yes. > > >>> Now suppose current structure isn't present (curr == NULL) and I call >>> foo() three times: >>> >>> foo(d1, s1); >>> foo(d2, s2); >>> foo(d3, s3); >>> >>> I would like to have finally (d3,s3). After first foo(), the first if is >>> followed and curr_s data structure and curr flag is filled. >>> After second foo(), the second if is followed and next_s/next are >>> filled. >>> >>> What happens with the third foo()? >> >> If the ISR has not fired, then next will be filled with d3, s3. >> >>> What happens if the ISR is fired >>> exactly *after* line marked with (*) above? >> >> Trouble. >> >>> The ISR will use next_s data structure, but it contains uncoherent data >>> values: data points to d3, but size is s2. >>> >>> How to avoid this problem? >>> >>> Just to explain my problem. The ISR is a timer interrupt that generates >>> and audio stream at a sampling frequency on a DAC output. foo() can be >>> named audio_play(). >> >> There are two ways to handle this that I can think of. One method is to >> disable interrupts (or at least the timer interrupt) around the critical >> section. As long as you remember to use volatile accesses or memory >> barriers appropriately, that should be easy to get right (for >> convenience, your interrupt enable/disable macros should include memory >> barriers). > > Yes, disabling interrupts is the simplest solution, but I usually try to > find a more elegant solution... if any exists.
Often, simple /is/ elegant! Remember, it is extremely difficult to spot issues with this sort of thing in testing. You can examine generated assembly code - but perhaps at a later date, with slight changes to other code, compiler options, or compiler versions, the generated assembly no longer matches. So it is better to be safe than sorry, even if that means marginally sub-optimal code. Typically for this sort of thing, adding extra volatiles is not going to cause inefficiency - you are going to do these reads and writes anyway. And disabling interrupts for a few cycles is rarely a real problem - it can be much better than using many more cycles for a non-locking solution.
> > >> Other than that, you basically have a two entry FIFO queue with one >> reader client (the ISR) and one writer client (foo). This can be >> implemented as a 3 entry ring buffer with separate head and tail >> pointers. The writer only ever writes to the tail pointer and the >> buffer, the reader only ever writes to the head pointer. As each part >> can only be written by one task, you can keep them consistent. However, >> a simple implementation will only work if you drop new values when your >> FIFO is full - it gets messy if you want to be able to overwrite the >> final entry with new data. > > But I need to keep the last pushed data, not dropping it.
Then it gets complicated! Basically, you need a some indirection in order to be able to atomically swap the old data with the latest data.
> > What about TWO next data structures? When a foo() is called and current > structure is busy, the free next data structure is filled and the next > pointer is then assigned to the new data structure. > > > struct mystruct { > const void *data; > size_t size; > } *curr, *next, curr_s, next1_s, next2_s; > > void foo(const void *data, size_t size) { > volatile struct mystruct *c = (volatile struct mystruct *)curr; > volatile struct mystruct *n = (volatile struct mystruct *)next; > if (c == NULL) { > curr_s.data = data; > curr_s.size = size; > c = &curr_s; > } else if ((n == NULL) || (n == &next2_s)) { // Use next1 > next1_s.data = data; > next1_s.size = size; > n = &next1_s; > } else { // Use next2 > next2_s.data = data; > next2_s.size = size; > n = &next2_s; > } > } > > If volatile cast guarantees that the order of the three instructions > inside if branches isn't changed, I think this solution works in any case. >
With the note from earlier that your curr_s (and now next1_s and next2_s) need to be volatile, or accessed through volatile pointers, I think this is probably okay - but I have not studied it thoroughly. I would recommend you declare the variables here as volatile - you have nothing to lose in efficiency, and it will make the code clearer. I also recommend typedef'ing your struct and declaring variables separately - again, it will make the code clearer. But that's a style issue, not a correctness issue, and different people have different styles.
Il 29/04/2016 13:02, David Brown ha scritto:
> On 29/04/16 09:02, pozz wrote: >> Il 28/04/2016 22:26, David Brown ha scritto: >>> On 28/04/16 15:15, pozz wrote: >>>> I have an ISR that uses a struct with two members: >>>> >>>> struct mystruct { >>>> const void *data; >>>> size_t size; >>>> } *curr, *next, curr_s, next_s; >>>> >>>> void ISR(void) { >>>> static const unsigned char *d; >>>> if (next) { >>>> *curr = *next; >>>> next = NULL; >>>> d = curr->data; >>>> } >>>> if (curr) { >>>> /* Process curr structure, byte by byte */ >>>> ... >>>> if (d - (const unsigned char *)curr->data == >>>> curr->size) { >>>> curr = NULL; >>>> } >>>> } >>>> } >>>> >>>> Now, to activate a new data structure I wrote the following routine: >>>> >>>> void foo(const void *data, size_t size) { >>>> if ((volatile struct mystruct *)curr == NULL) { >>>> curr_s.data = data; >>>> curr_s.size = size; >>>> curr = &curr_s; >>>> } else { >>>> (*) next_s.data = data; >>>> next_s.size = size; >>>> next = &next_s; >>>> } >>>> } >>>> >>>> If a current data structure is present, it should be stopped and the new >>>> one should take place. >>>> >>>> The first if guarantees that there isn't a current structure, so curr >>>> and curr_s can be changed without any problem. >>> >>> Your code above does not guarantee that curr_s is set appropriately >>> before curr is set. The compiler is free to store the address of curr_s >>> in curr before setting curr_s.data and curr_s.size. >> >> You are right, I didn't thought about this situation. >> >> >>> To force the desired ordering, you have two choices. You can make sure >>> curr_s /and/ curr are both accessed as volatile (by using casts to >>> volatile, or by simply declaring the variables as volatile). Or you can >>> use a memory barrier to stop the movement of accesses back and forth: >>> >>> asm volatile("" ::: "memory"); >>> >>> (assuming gcc or a compiler that supports gcc inline assembly). >>> >>> Unless you are sure you know what you are doing with memory barriers, >>> and with getting volatile casts in exactly the right places, it is >>> usually easier to declare the relevant variables as volatile, at the >>> risk of slightly suboptimal code generation. And that means /all/ the >>> relevant variables - volatile ordering does not affect the ordering of >>> non-volatile accesses. >> >> Does the following code fix that problem? >> >> void foo(const void *data, size_t size) { >> volatile struct mystruct *c = (volatile struct mystruct *)curr; >> if (c == NULL) { >> curr_s.data = data; >> curr_s.size = size; >> c = &curr_s; >> } else { >> /* ... */ >> } >> } >> >> Now is the instruction "c = &curr_s;" *guaranteed* to be executed >> *after* curr_s assignements? >> > > No - volatile accesses are only ordered with respect to other volatile > accesses (and other observable or potentially observable events, such as > calls to external functions). The stores to curr_s can be moved after > the store to c, since the stores to curr_s are not volatile. > > (You could also add a memory barrier before "c = &curr_s;".)
In that case (adding a memory barrier before "c = &curr_s", is it necessary to declare curr_s and c variables as volatile?
>>>> The second if happens when a current structure really exists. In this >>>> case, I can't change curr_s and curr directly, so I use a temporary data >>>> structure next_s and an atomic flag next (I'm working on 32-bits >>>> architecture, so the assignment is atomic). The next data will take >>>> place in the next ISR. >>>> >>> >>> Again, you need volatiles or a memory barrier to force the setting of >>> next_s before the setting of next. >> >> Yes. >> >> >>>> Now suppose current structure isn't present (curr == NULL) and I call >>>> foo() three times: >>>> >>>> foo(d1, s1); >>>> foo(d2, s2); >>>> foo(d3, s3); >>>> >>>> I would like to have finally (d3,s3). After first foo(), the first if is >>>> followed and curr_s data structure and curr flag is filled. >>>> After second foo(), the second if is followed and next_s/next are >>>> filled. >>>> >>>> What happens with the third foo()? >>> >>> If the ISR has not fired, then next will be filled with d3, s3. >>> >>>> What happens if the ISR is fired >>>> exactly *after* line marked with (*) above? >>> >>> Trouble. >>> >>>> The ISR will use next_s data structure, but it contains uncoherent data >>>> values: data points to d3, but size is s2. >>>> >>>> How to avoid this problem? >>>> >>>> Just to explain my problem. The ISR is a timer interrupt that generates >>>> and audio stream at a sampling frequency on a DAC output. foo() can be >>>> named audio_play(). >>> >>> There are two ways to handle this that I can think of. One method is to >>> disable interrupts (or at least the timer interrupt) around the critical >>> section. As long as you remember to use volatile accesses or memory >>> barriers appropriately, that should be easy to get right (for >>> convenience, your interrupt enable/disable macros should include memory >>> barriers). >> >> Yes, disabling interrupts is the simplest solution, but I usually try to >> find a more elegant solution... if any exists. > > Often, simple /is/ elegant! > > Remember, it is extremely difficult to spot issues with this sort of > thing in testing. You can examine generated assembly code - but perhaps > at a later date, with slight changes to other code, compiler options, or > compiler versions, the generated assembly no longer matches. > > So it is better to be safe than sorry, even if that means marginally > sub-optimal code. Typically for this sort of thing, adding extra > volatiles is not going to cause inefficiency - you are going to do these > reads and writes anyway. And disabling interrupts for a few cycles is > rarely a real problem - it can be much better than using many more > cycles for a non-locking solution.
Yes, you're right again. If I will decide to disable timer interrupt in foo(), is it ok to avoid declaring variables (curr, next, curr_s, ...) with keyword volatile? I think yes, it should be ok. If interrupt (where the variables are potentially changed) is disabled, the variables aren't volatile.
>>> Other than that, you basically have a two entry FIFO queue with one >>> reader client (the ISR) and one writer client (foo). This can be >>> implemented as a 3 entry ring buffer with separate head and tail >>> pointers. The writer only ever writes to the tail pointer and the >>> buffer, the reader only ever writes to the head pointer. As each part >>> can only be written by one task, you can keep them consistent. However, >>> a simple implementation will only work if you drop new values when your >>> FIFO is full - it gets messy if you want to be able to overwrite the >>> final entry with new data. >> >> But I need to keep the last pushed data, not dropping it. > > Then it gets complicated! > > Basically, you need a some indirection in order to be able to atomically > swap the old data with the latest data. > >> >> What about TWO next data structures? When a foo() is called and current >> structure is busy, the free next data structure is filled and the next >> pointer is then assigned to the new data structure. >> >> >> struct mystruct { >> const void *data; >> size_t size; >> } *curr, *next, curr_s, next1_s, next2_s; >> >> void foo(const void *data, size_t size) { >> volatile struct mystruct *c = (volatile struct mystruct *)curr; >> volatile struct mystruct *n = (volatile struct mystruct *)next; >> if (c == NULL) { >> curr_s.data = data; >> curr_s.size = size; >> c = &curr_s; >> } else if ((n == NULL) || (n == &next2_s)) { // Use next1 >> next1_s.data = data; >> next1_s.size = size; >> n = &next1_s; >> } else { // Use next2 >> next2_s.data = data; >> next2_s.size = size; >> n = &next2_s; >> } >> } >> >> If volatile cast guarantees that the order of the three instructions >> inside if branches isn't changed, I think this solution works in any case. >> > > With the note from earlier that your curr_s (and now next1_s and > next2_s) need to be volatile, or accessed through volatile pointers, I > think this is probably okay - but I have not studied it thoroughly. > > I would recommend you declare the variables here as volatile - you have > nothing to lose in efficiency, and it will make the code clearer. > > I also recommend typedef'ing your struct and declaring variables > separately - again, it will make the code clearer. But that's a style > issue, not a correctness issue, and different people have different styles.
On 29/04/16 13:22, pozz wrote:
> Il 29/04/2016 13:02, David Brown ha scritto: >> On 29/04/16 09:02, pozz wrote:
<snip>
>>> Now is the instruction "c = &curr_s;" *guaranteed* to be executed >>> *after* curr_s assignements? >>> >> >> No - volatile accesses are only ordered with respect to other volatile >> accesses (and other observable or potentially observable events, such as >> calls to external functions). The stores to curr_s can be moved after >> the store to c, since the stores to curr_s are not volatile. >> >> (You could also add a memory barrier before "c = &curr_s;".) > > In that case (adding a memory barrier before "c = &curr_s", is it > necessary to declare curr_s and c variables as volatile? >
You don't need both - a memory barrier tells the compiler that anything that the source code says should be written before the barrier, will be written before the barrier, and that anything that the source code says should be written after the barrier, is written after the barrier. It also says that anything the compiler has read before the barrier should be discarded and re-read after the barrier if needed. So the memory barrier provides a synchronisation point for all memory accesses, volatile and non-volatile. Note that both "volatile" and compiler memory barriers (using empty inline assembly with a "memory clobber") are purely compiler-level concepts. They don't affect any scheduling or pipelining in the cpu, cache, or other hardware. <snip>
>>> Yes, disabling interrupts is the simplest solution, but I usually try to >>> find a more elegant solution... if any exists. >> >> Often, simple /is/ elegant! >> >> Remember, it is extremely difficult to spot issues with this sort of >> thing in testing. You can examine generated assembly code - but perhaps >> at a later date, with slight changes to other code, compiler options, or >> compiler versions, the generated assembly no longer matches. >> >> So it is better to be safe than sorry, even if that means marginally >> sub-optimal code. Typically for this sort of thing, adding extra >> volatiles is not going to cause inefficiency - you are going to do these >> reads and writes anyway. And disabling interrupts for a few cycles is >> rarely a real problem - it can be much better than using many more >> cycles for a non-locking solution. > > Yes, you're right again. > > If I will decide to disable timer interrupt in foo(), is it ok to avoid > declaring variables (curr, next, curr_s, ...) with keyword volatile? > > I think yes, it should be ok. If interrupt (where the variables are > potentially changed) is disabled, the variables aren't volatile. >
Again, you need either volatiles or memory barriers to make sure the required accesses (and preferably /only/ the required accesses) are within the critical section. An easy way to do this is to make memory barriers part of the enable/disable interrupt functions. But note that memory barriers are a heavy-handed approach - if your code must be as fast as possible, and your processor has lots of registers, long pipelines and super-scaler scheduling, then sometimes it is not ideal. For typical microcontrollers, however, it works very well. I've included a few of my favourite functions below. The assembly used is for the ARM Cortex-M4, but can easily be adapted for other microcontrollers (and simpler cpus like the AVR or MSP430 don't need any cpu barrier or flush instructions - but you still need the memory clobber!). One of these that is likely to be a new concept for many (I can imagine Tim peering intently at this one!) is the "forceDependency" macro below. This is a solution to problems such as : void setBaudrate(uint32_t baud) { uint32_t divisor = ((busClockTicksPerMicro * 1000000 * 2) + (baud / 2)) / baud; uint32_t sbr = divisor >> 5; UART_Type* pUart = UART2_BASE_PTR; uint8_t BDH = (sbr >> 8) & 0x1f; uint8_t BDL = sbr & 0xff; uint8_t C4 = divisor & 0x1f; disableInterrupts(); // Critical section starts here pUart->BDH = BDH; pUart->BDL = BDL; pUart->C4 = C4; enableInterrupts(); } First, "divisor" and "sbr" are calculated, which is a slow process involving multiple cycles and a library call or two. Then the desired hardware register values are derived. Then interrupts are disabled, and the values are written into the peripheral registers. It looks like interrupts are disabled for the minimal length of time. But... Even though the disableInterrupt() and enableInterrupt() functions have a memory barrier, /and/ the pUart pointer is to a volatile variable, those only apply to memory accesses. /Calculations/ can be moved around as the compiler fancies. The compiler may not bother calculating "divisor", "sbr", and so on, until they are actually needed - and that means /after/ the disableInterrupt() call. The result is that it is quite possible for all the long calculations to be done within the critical section. The solution is to put: forceDependency(BDH); forceDependency(BDL); forceDependency(C4); before disableInterrupt(). The "forceDependency" macro tells the compiler that it needs to know the value of "BDH" for some inline assembly - even though the inline assembly is empty and does nothing. I believe that is enough lessons for today. I hope my posts here have been of use to both the OP (well done for asking difficult and interesting questions) and to others in the group. mvh., David // Ensure that "val" has been calculated before next volatile access // by requiring it as an assembly input. Note that only volatiles are // ordered! #define forceDependency(val) \ asm volatile("" :: "" (val) : ) // Ensures that all outstanding memory transactions complete before // starting new ones static inline void memoryBarrier(void) { asm volatile("dmb" : : : "memory"); } // Compiler-only memory barrier static inline void compilerBarrier(void) { asm volatile("" ::: "memory"); } // Ensures that all outstanding memory transactions complete before // next instruction static inline void dsync(void) { asm volatile("dsb" : : : "memory"); } // Ensures that all outstanding memory transactions complete, then // flush instruction buffers static inline void isync(void) { asm volatile("isb" : : : "memory"); }

Memfault Beyond the Launch