How to write a simple driver in bare metal systems: volatile, memory barrier, critical sections and so on | page 4

Reply by Niklas Holsti ● October 25, 2021

On 2021-10-25 16:04, Dimiter_Popoff wrote:
> On 10/25/2021 11:09, Niklas Holsti wrote:
>> On 2021-10-24 23:27, Dimiter_Popoff wrote:
>>> On 10/24/2021 22:54, Don Y wrote:
>>>> On 10/24/2021 4:14 AM, Dimiter_Popoff wrote:
>>>>>> Disable interrupts while accessing the fifo. you really have to.
>>>>>> alternatively you'll often get away not using a fifo at all,
>>>>>> unless you're blocking for a long while in some part of the code.
>>>>>
>>>>> Why would you do that. The fifo write pointer is only modified by
>>>>> the interrupt handler, the read pointer is only modified by the
>>>>> interrupted code. Has been done so for times immemorial.
>>>>
>>>> The OPs code doesn't differentiate between FIFO full and empty.
>>>
>>> So he should fix that first, there is no sane reason why not.
>>> Few things are simpler to do than that.
>>
>>
>>     [snip]
>>
>>
>>> Whatever handshakes he makes there is no problem knowing whether
>>> the fifo is full - just check if the position the write pointer
>>> will have after putting the next byte matches the read pointer
>>> at the moment. Like I said before, few things are simpler than
>>> that, can't imagine someone working as a programmer being
>>> stuck at *that*.
>>
>> That simple check would require keeping a maximum of only N-1 entries 
>> in the N-position FIFO buffer, and the OP explicitly said they did not 
>> want to allocate an unused place in the buffer (which I think is 
>> unreasonable of the OP, but that is only IMO).
> 
> Well it might be reasonable if the fifo has a size of two, you know :-).


And if each of those two items is large, yes. But here we have a FIFO of 
8-bit characters... few programs are so tight on memory that they cannot 
stand one unused octet.

Reply by Don Y ● October 25, 2021

On 10/25/2021 8:34 AM, Niklas Holsti wrote:
> And if each of those two items is large, yes. But here we have a FIFO of 8-bit 
> characters... few programs are so tight on memory that they cannot stand one 
> unused octet.

It's not "unused".  Rather, its role is that of indicating "full/overrun".
The OP seems to have decided that this is of no concern -- in *one* app?

Reply by pozz ● October 25, 2021

Il 25/10/2021 17:34, Niklas Holsti ha scritto:
> On 2021-10-25 16:04, Dimiter_Popoff wrote:
>> On 10/25/2021 11:09, Niklas Holsti wrote:
>>> On 2021-10-24 23:27, Dimiter_Popoff wrote:
>>>> On 10/24/2021 22:54, Don Y wrote:
>>>>> On 10/24/2021 4:14 AM, Dimiter_Popoff wrote:
>>>>>>> Disable interrupts while accessing the fifo. you really have to.
>>>>>>> alternatively you'll often get away not using a fifo at all,
>>>>>>> unless you're blocking for a long while in some part of the code.
>>>>>>
>>>>>> Why would you do that. The fifo write pointer is only modified by
>>>>>> the interrupt handler, the read pointer is only modified by the
>>>>>> interrupted code. Has been done so for times immemorial.
>>>>>
>>>>> The OPs code doesn't differentiate between FIFO full and empty.
>>>>
>>>> So he should fix that first, there is no sane reason why not.
>>>> Few things are simpler to do than that.
>>>
>>>
>>>     [snip]
>>>
>>>
>>>> Whatever handshakes he makes there is no problem knowing whether
>>>> the fifo is full - just check if the position the write pointer
>>>> will have after putting the next byte matches the read pointer
>>>> at the moment. Like I said before, few things are simpler than
>>>> that, can't imagine someone working as a programmer being
>>>> stuck at *that*.
>>>
>>> That simple check would require keeping a maximum of only N-1 entries 
>>> in the N-position FIFO buffer, and the OP explicitly said they did 
>>> not want to allocate an unused place in the buffer (which I think is 
>>> unreasonable of the OP, but that is only IMO).
>>
>> Well it might be reasonable if the fifo has a size of two, you know :-).
> 
> 
> And if each of those two items is large, yes. But here we have a FIFO of 
> 8-bit characters... few programs are so tight on memory that they cannot 
> stand one unused octet.

When I have a small (<256) power-of-two (16, 32, 64, 128) buffer, which 
is the case for a UART receive ring buffer, I like to use this 
implementation: it works and doesn't waste any element.

However, I know this isn't the best implementation ever, and it's a pity 
the thread has focused on criticizing this implementation, which was 
only meant as *one* example to discuss.

The main point was the use of volatile (and other techniques) to 
guarantee correct compiler output, whatever legal (that is, conforming 
to the C standard) optimizations the compiler decides to apply.

It seems to me the arguments for or against volatile are completely 
independent of the ring-buffer implementation.
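
The implementation referred to here is not shown on this page. As a 
rough sketch of one common way to get a power-of-two buffer that wastes 
no element (not necessarily the poster's code), free-running 8-bit 
counters can be masked only when indexing. The names rb_buf, rb_in, 
rb_out and RB_SIZE are made up for illustration, and a single producer 
(ISR) and single consumer (main loop) on one core are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 32u   /* hypothetical size: power of two, at most 128 */
static_assert((RB_SIZE & (RB_SIZE - 1u)) == 0u && RB_SIZE <= 128u,
              "RB_SIZE must be a power of two no larger than 128");

static volatile uint8_t rb_buf[RB_SIZE];
static volatile uint8_t rb_in;    /* free-running, written only by the producer */
static volatile uint8_t rb_out;   /* free-running, written only by the consumer */

/* Producer side (e.g. the UART rx ISR). Returns false when full. */
bool rb_put(uint8_t c)
{
    uint8_t in = rb_in;                       /* cache the volatile counter */
    if ((uint8_t)(in - rb_out) == RB_SIZE) {  /* difference is the fill level */
        return false;
    }
    rb_buf[in & (RB_SIZE - 1u)] = c;          /* mask only when indexing */
    rb_in = (uint8_t)(in + 1u);               /* publish after the data store */
    return true;
}

/* Consumer side (e.g. the main loop). Returns false when empty. */
bool rb_get(uint8_t *c)
{
    uint8_t out = rb_out;
    if (rb_in == out) {
        return false;
    }
    *c = rb_buf[out & (RB_SIZE - 1u)];
    rb_out = (uint8_t)(out + 1u);             /* release the slot after the read */
    return true;
}
```

Because the counters wrap modulo 256 while the buffer holds at most 128 
entries, "empty" (difference 0) and "full" (difference RB_SIZE) stay 
distinguishable without a spare slot or a separate flag.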

Reply by Dimiter_Popoff ● October 25, 2021

On 10/25/2021 20:43, Don Y wrote:
> On 10/25/2021 8:34 AM, Niklas Holsti wrote:
>> And if each of those two items is large, yes. But here we have a FIFO 
>> of 8-bit characters... few programs are so tight on memory that they 
>> cannot stand one unused octet.
> 
> It's not "unused".  Rather, its role is that of indicating "full/overrun".
> The OP seems to have decided that this is of no concern -- in *one* app?

Oh come on, I joked about the fifo of two bytes only because this whole
thread is a joke - pages and pages of C to maintain a fifo, what can be
more of a joke than this.

Reply by Don Y ● October 25, 2021

On 10/25/2021 10:53 AM, Dimiter_Popoff wrote:
> On 10/25/2021 20:43, Don Y wrote:
>> On 10/25/2021 8:34 AM, Niklas Holsti wrote:
>>> And if each of those two items is large, yes. But here we have a FIFO of 
>>> 8-bit characters... few programs are so tight on memory that they cannot 
>>> stand one unused octet.
>>
>> It's not "unused".  Rather, its role is that of indicating "full/overrun".
>> The OP seems to have decided that this is of no concern -- in *one* app?
> 
> Oh come on, I joked about the fifo of two bytes only because this whole
> thread is a joke

My comment applies regardless of the size of the FIFO.

> - pages and pages of C to maintain a fifo, what can be
> more of a joke than this.

Where do you see "pages and pages of C to maintain a FIFO"?

Reply by Don Y ● October 25, 2021

On 10/25/2021 10:52 AM, pozz wrote:
> However I know this isn't the best implementation ever and it's a pity the 
> thread emphasis has been against this implementation (that was used as *one* 
> implementation just to have an example to discuss on).

The point is that you need a COMPLETE implementation before you start
thinking about the amount of "license" the compiler can take with your code.

Here's *part* of an implementation:

       a = 37;

Now, should I declare A as volatile? Use the register qualifier?
What size should the integer A be?  Can the optimizer elide this
statement from my code?

All sorts of questions whose answers depend on the REST of the
implementation -- not shown!

> The main point was the use of volatile (and other techniques) to guarantee a 
> correct compiler output, whatever legal (respect the C standard) optimizations 
> the compiler thinks to do.
> 
> It seems to me the arguments against or for volatile are completely independent 
> from the implementation of ring-buffer.

It has to do with indicating how YOU (the developer) see the object
being used (accessed).  You, in theory, know more about the role of
the object than the compiler (because it may be accessed in other modules,
or, have "stuff" tied to it -- like special hardware, etc.)  You need a way
to tell the compiler that "you know what you are doing" in your use
of the object and that it should restrain itself from making assumptions
that might not be true.

If your example doesn't bring to light those various issues, then
the decision as to its applicability is moot.

Reply by pozz ● October 25, 2021

Il 23/10/2021 18:09, David Brown ha scritto:
[...]
> Marking "in" and "buf" as volatile is /far/ better than using a critical
> section, and likely to be more efficient than a memory barrier.  You can
> also use volatileAccess rather than making buf volatile, and it is often
> slightly more efficient to cache volatile variables in a local variable
> while working with them.
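
The volatileAccess macro is not defined on this page. A minimal sketch, 
assuming a definition along the following lines (it relies on the 
GCC/Clang __typeof__ extension), shows both suggestions from the quoted 
text: the buffer itself stays a plain array that is accessed through the 
macro, and the volatile indices are cached in locals while working. All 
va_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definition: force one volatile access to an otherwise plain object. */
#define volatileAccess(v) (*(volatile __typeof__(v) *)&(v))

#define VA_SIZE 8u
static uint8_t va_buf[VA_SIZE];    /* not volatile; accessed via the macro */
static volatile uint8_t va_in;     /* written only by the producer (ISR) */
static volatile uint8_t va_out;    /* written only by the consumer */

/* Producer side. Returns false when full (one slot is kept free). */
bool va_put(uint8_t c)
{
    uint8_t in = va_in;
    uint8_t next = (uint8_t)((in + 1u) % VA_SIZE);
    if (next == va_out) {
        return false;
    }
    volatileAccess(va_buf[in]) = c;
    va_in = next;
    return true;
}

/* Consumer side: copy out everything queued so far, up to max bytes. */
unsigned va_drain(uint8_t *dst, unsigned max)
{
    uint8_t in = va_in;     /* one volatile read, then work on the copy */
    uint8_t out = va_out;
    unsigned n = 0;
    while (out != in && n < max) {
        dst[n++] = volatileAccess(va_buf[out]);
        out = (uint8_t)((out + 1u) % VA_SIZE);
    }
    va_out = out;           /* one volatile write at the end */
    return n;
}
```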

I think I got your point, but I'm wondering why there are plenty of 
examples of ring-buffer implementations that don't use volatile at all, 
even if the author explicitly refers to interrupts and multithreading.

Just an example[1] by Quantum Leaps. It promises to be a *lock-free* (I 
think thread-safe) ring-buffer implementation in the scenario of single 
producer/single consumer (that is my scenario too).

In the source code there's no use of volatile. I could call 
RingBuf_put() in my rx uart ISR and call RingBuf_get() in my mainloop code.

From what I learned from you, this code usually works, but the standard 
doesn't guarantee it will work with every past, current and future compiler.



[1] https://github.com/QuantumLeaps/lock-free-ring-buffer

Reply by Niklas Holsti ● October 25, 2021

On 2021-10-25 20:52, pozz wrote:
> Il 25/10/2021 17:34, Niklas Holsti ha scritto:
>> On 2021-10-25 16:04, Dimiter_Popoff wrote:
>>> On 10/25/2021 11:09, Niklas Holsti wrote:
>>>> On 2021-10-24 23:27, Dimiter_Popoff wrote:
>>>>> On 10/24/2021 22:54, Don Y wrote:
>>>>>> On 10/24/2021 4:14 AM, Dimiter_Popoff wrote:
>>>>>>>> Disable interrupts while accessing the fifo. you really have to.
>>>>>>>> alternatively you'll often get away not using a fifo at all,
>>>>>>>> unless you're blocking for a long while in some part of the code.
>>>>>>>
>>>>>>> Why would you do that. The fifo write pointer is only modified by
>>>>>>> the interrupt handler, the read pointer is only modified by the
>>>>>>> interrupted code. Has been done so for times immemorial.
>>>>>>
>>>>>> The OPs code doesn't differentiate between FIFO full and empty.

(I suspect something is not quite right with the attributions of the 
quotations above -- Dimiter probably did not suggest disabling 
interrupts -- but no matter.)

    [snip]

> When I have a small (<256) power-of-two (16, 32, 64, 128) buffer (and 
> this is the case for a UART receiving ring-buffer), I like to use this 
> implementation that works and doesn't waste any element.
> 
> However I know this isn't the best implementation ever and it's a pity 
> the thread emphasis has been against this implementation (that was used 
> as *one* implementation just to have an example to discuss on).
> 
> The main point was the use of volatile (and other techniques) to 
> guarantee a correct compiler output, whatever legal (respect the C 
> standard) optimizations the compiler thinks to do.
> 
> It seems to me the arguments against or for volatile are completely 
> independent from the implementation of ring-buffer.

Of course "volatile" is needed, in general, whenever anything is written 
in one thread and read in another. The issue, I think, is when 
"volatile" is _enough_.

I feel that detection of a full buffer (FIFO overflow) is required for a 
proper ring buffer implementation, and that has implications for the 
data structure needed, and that has implications for whether critical 
sections are needed.

If the FIFO implementation is based on just two pointers (read and 
write), and each pointer is modified by just one of the two threads 
(main thread = reader, and interrupt handler = writer), and those 
modifications are both "volatile" AND atomic (which has not been 
discussed so far, IIRC...), then one can do without a critical region. 
But then detection of a full buffer needs one "wasted" element in the 
buffer.
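
The two-pointer scheme just described can be sketched as follows. This 
is only an illustration under the stated assumptions: one core, the 
interrupt handler is the only writer of fifo_wr, the main thread is the 
only writer of fifo_rd, and 8-bit index loads and stores are atomic on 
the target. FIFO_SIZE and the fifo_* names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SIZE 16u   /* hypothetical; holds at most FIFO_SIZE - 1 bytes */
static_assert(FIFO_SIZE >= 2u && FIFO_SIZE <= 256u,
              "indices must fit in uint8_t");

static volatile uint8_t fifo_buf[FIFO_SIZE];
static volatile uint8_t fifo_wr;   /* modified only by the interrupt handler */
static volatile uint8_t fifo_rd;   /* modified only by the main thread */

/* Writer (interrupt handler). Returns false on a full buffer (overrun). */
bool fifo_put(uint8_t c)
{
    uint8_t wr = fifo_wr;
    uint8_t next = (uint8_t)((wr + 1u) % FIFO_SIZE);
    if (next == fifo_rd) {
        return false;          /* next write position would hit the read pointer */
    }
    fifo_buf[wr] = c;
    fifo_wr = next;            /* publish only after the data is stored */
    return true;
}

/* Reader (main thread). Returns false when the buffer is empty. */
bool fifo_get(uint8_t *c)
{
    uint8_t rd = fifo_rd;
    if (rd == fifo_wr) {
        return false;
    }
    *c = fifo_buf[rd];
    fifo_rd = (uint8_t)((rd + 1u) % FIFO_SIZE);
    return true;
}
```

No critical section appears because each index has exactly one writer. 
The cost is the one slot that is never filled.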

To avoid the wasted element, one could add a "full"/"not full" Boolean 
flag. But that flag would be modified by both threads, and should be 
modified atomically together with the pointer modifications, which (I 
think) means that a critical section is needed.
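
A sketch of that flag-based variant, assuming the writer runs in an 
interrupt handler that cannot itself be interrupted by the reader. The 
irq_save() and irq_restore() hooks are hypothetical placeholders for the 
target's interrupt disable/restore primitives; they are stubbed as 
no-ops here so the sketch builds on a host.

```c
#include <stdbool.h>
#include <stdint.h>

#define Q_SIZE 4u   /* hypothetical; all Q_SIZE slots are usable */

/* Hypothetical port hooks: replace with the real interrupt mask/restore. */
static inline unsigned irq_save(void) { return 0u; }
static inline void irq_restore(unsigned state) { (void)state; }

static volatile uint8_t q_buf[Q_SIZE];
static volatile uint8_t q_wr;
static volatile uint8_t q_rd;
static volatile bool q_full;      /* modified by both sides */

/* Writer: called from the ISR, assumed to run with the reader locked out. */
bool q_put_isr(uint8_t c)
{
    if (q_full) {
        return false;
    }
    q_buf[q_wr] = c;
    q_wr = (uint8_t)((q_wr + 1u) % Q_SIZE);
    q_full = (q_wr == q_rd);      /* pointers meeting after a put means full */
    return true;
}

/* Reader: pointer and flag must change together, hence the critical section. */
bool q_get(uint8_t *c)
{
    bool ok = false;
    unsigned state = irq_save();
    if (q_full || q_rd != q_wr) {
        *c = q_buf[q_rd];
        q_rd = (uint8_t)((q_rd + 1u) % Q_SIZE);
        q_full = false;
        ok = true;
    }
    irq_restore(state);
    return ok;
}
```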

Reply by David Brown ● October 25, 2021

On 25/10/2021 20:15, pozz wrote:
> Il 23/10/2021 18:09, David Brown ha scritto:
> [...]
>> Marking "in" and "buf" as volatile is /far/ better than using a critical
>> section, and likely to be more efficient than a memory barrier.  You can
>> also use volatileAccess rather than making buf volatile, and it is often
>> slightly more efficient to cache volatile variables in a local variable
>> while working with them.
> 
> I think I got your point, but I'm wondering why there are plenty of
> examples of ring-buffer implementations that don't use volatile at all,
> even if the author explicitly refers to interrupts and multithreading.

You don't have to use "volatile".  You can make correct code here using
critical sections - it's just a lot less efficient.  (If you have a
queue where more than one context can be reading it or writing it, then
you /do/ need some kind of locking mechanism.)

You can also use memory barriers instead of volatile, but it is likely
to be slightly less efficient.
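
As an illustration of the barrier-based alternative, here is a minimal 
sketch assuming GCC or Clang, where an empty asm statement with a 
"memory" clobber acts as a compiler-only barrier. It emits no 
instruction and is not a hardware barrier, so it only addresses the 
single-core ISR/main-loop case. The mb_* names and COMPILER_BARRIER are 
made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang inline-asm extension). */
#define COMPILER_BARRIER() __asm__ volatile("" ::: "memory")

#define MB_SIZE 8u
static uint8_t mb_buf[MB_SIZE];   /* deliberately not volatile */
static uint8_t mb_wr;             /* written only by the ISR side */
static uint8_t mb_rd;             /* written only by the main-loop side */

bool mb_put(uint8_t c)
{
    uint8_t next = (uint8_t)((mb_wr + 1u) % MB_SIZE);
    COMPILER_BARRIER();           /* do not reuse a stale copy of mb_rd */
    if (next == mb_rd) {
        return false;
    }
    mb_buf[mb_wr] = c;
    COMPILER_BARRIER();           /* data store is emitted before the index store */
    mb_wr = next;
    COMPILER_BARRIER();           /* index store is not deferred past this point */
    return true;
}

bool mb_get(uint8_t *c)
{
    COMPILER_BARRIER();           /* re-read mb_wr from memory on every call */
    if (mb_rd == mb_wr) {
        return false;
    }
    *c = mb_buf[mb_rd];
    COMPILER_BARRIER();           /* data read happens before the slot is released */
    mb_rd = (uint8_t)((mb_rd + 1u) % MB_SIZE);
    COMPILER_BARRIER();
    return true;
}
```

Each barrier makes the compiler discard everything it knows about 
memory, not just the few objects involved, which is one reason this can 
come out slightly less efficient than volatile.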

You can also use atomics instead of volatiles, but it is also quite
likely to be slightly less efficient.  If you have an SMP system, on the
other hand, then you need something more than volatile and compiler
memory barriers - atomics are quite possibly the most efficient solution
in that case.
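
For the atomics route, a single-producer/single-consumer queue using C11 
<stdatomic.h> might look like the sketch below. The release store on an 
index pairs with the acquire load on the other side, which is what makes 
the data visible across cores. The aq_* names are hypothetical, and the 
target is assumed to provide lock-free atomic_uint.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define AQ_SIZE 16u   /* hypothetical; holds at most AQ_SIZE - 1 bytes */

static uint8_t aq_buf[AQ_SIZE];
static atomic_uint aq_wr;   /* static storage, so zero-initialised */
static atomic_uint aq_rd;

/* Producer: the only context that stores to aq_wr. */
bool aq_put(uint8_t c)
{
    unsigned wr = atomic_load_explicit(&aq_wr, memory_order_relaxed);
    unsigned next = (wr + 1u) % AQ_SIZE;
    if (next == atomic_load_explicit(&aq_rd, memory_order_acquire)) {
        return false;                                   /* full */
    }
    aq_buf[wr] = c;
    /* Release: the data store above is visible before the new index. */
    atomic_store_explicit(&aq_wr, next, memory_order_release);
    return true;
}

/* Consumer: the only context that stores to aq_rd. */
bool aq_get(uint8_t *c)
{
    unsigned rd = atomic_load_explicit(&aq_rd, memory_order_relaxed);
    if (rd == atomic_load_explicit(&aq_wr, memory_order_acquire)) {
        return false;                                   /* empty */
    }
    *c = aq_buf[rd];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&aq_rd, (rd + 1u) % AQ_SIZE, memory_order_release);
    return true;
}
```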

And sometimes you can make code that doesn't need any special treatment
at all, because you know the way it is being called.  If the two ends of
your buffer are handled by tasks in a cooperative multi-tasking
scenario, then there is no problem - you don't need to worry about
volatile or any alternatives.  If you know your interrupt can't occur
while the other end of the buffer is being handled, that can reduce your
need for volatile.  (In particular, that can also avoid complications if
you have counter variables that are bigger than the processor can handle
atomically - usually not a problem for a 32-bit Cortex-M, but often
important on an 8-bit AVR.)
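
To make the multi-byte counter problem concrete: on an 8-bit CPU a 
16-bit read takes two loads, and an interrupt between them can yield a 
torn value. Besides locking the interrupt out, one common workaround 
(not something proposed in this post) is to read until two consecutive 
reads agree. The sketch assumes the counter is written only by an ISR 
that runs to completion; tick_count is a made-up name.

```c
#include <stdint.h>

/* Hypothetical counter, written only by an interrupt handler. */
static volatile uint16_t tick_count;

/* Main-loop side: returns a value that was not torn by the ISR. */
uint16_t read_tick_count(void)
{
    uint16_t a;
    uint16_t b;
    do {
        a = tick_count;   /* may be torn if the ISR fires mid-read */
        b = tick_count;   /* a second read exposes the mismatch */
    } while (a != b);
    return a;
}
```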

If you know, for a fact, that the code will be compiled by a weak
compiler or with weak optimisation, or that the "get" and "put"
implementations will always be in a separately compiled unit from code
calling these functions and you'll never use any kind of cross-unit
optimisations, then you can often get away without using volatile.

> 
> Just an example[1] by Quantum Leaps. It promises to be a *lock-free* (I
> think thread-safe) ring-buffer implementation in the scenario of single
> producer/single consumer (that is my scenario too).

It's lock-free, but not safe in the face of modern optimisation (gcc has
had LTO for many years, and a lot of high-end commercial embedded
compilers have used such techniques for decades).  And I'd want to study
it in detail and think a lot before accepting that it is safe to use its
16-bit counters on an 8-bit AVR.  That could be fixed by just changing
the definition of the RingBufCtr type, which is a nice feature in the code.

> 
> In the source code there's no use of volatile. I could call
> RingBuf_put() in my rx uart ISR and call RingBuf_get() in my mainloop code.
> 

You don't want to call functions from an ISR if you can avoid it, unless
the functions are defined in the same unit and can be inlined.  On many
processors (less so on the Cortex-M) calling an external function from
an ISR means a lot of overhead to save and restore the so-called
"volatile" registers (no relation to the C keyword "volatile"), usually
completely unnecessarily.

> From what I learned from you, this code usually works, but the standard
> doesn't guarantee it will work with every old, current and future
> compilers.
> 

Yes, that's a fair summary.

It might be good enough for some purposes.  But since "volatile" will
cost nothing in code efficiency but greatly increase the portability and
safety of the code, I'd recommend using it.  And I am certainly in
favour of thinking carefully about these things - as you did in the
first place, which is why we have this thread.

> 
> 
> [1] https://github.com/QuantumLeaps/lock-free-ring-buffer

Reply by David Brown ● October 25, 2021

On 25/10/2021 17:34, Niklas Holsti wrote:

> And if each of those two items is large, yes. But here we have a FIFO of
> 8-bit characters... few programs are so tight on memory that they cannot
> stand one unused octet.

I remember a program I worked with where the main challenge for the
final features was not figuring out the implementation, but finding a
few spare bytes of code space and a couple of spare bits of ram to use.
And that was with 32 KB ROM and 512 bytes RAM (plus some bits in the
registers of peripherals that weren't used).  That was probably the last
big assembly program I wrote - non-portability was a killer.
