
Cache memory vs volatile

Started by sadash 4 years ago · 9 replies · latest reply 4 years ago · 2132 views

Can you explain how volatile prevents the compiler from optimizing variable accesses by caching them in registers? What does the MMU do in this context?

Reply by mrfirmware, March 20, 2021

It's probably worth reading https://en.cppreference.com/w/c/language/volatile to understand volatile a bit more. My rule on volatile is: if the compiler cannot detect another writer of a variable, then I declare it volatile. Memory-mapped hardware registers fall into this category. The HW register is updated by an external actor (the hardware that owns the register), and the compiler does not expect this, so I'd declare the pointer to that hardware register as a pointer to volatile. The other thing you can use volatile for is forcing a sequence point in the compiler's code-generation reordering, but it's probably better to insert a compiler barrier instead, e.g. __asm__ __volatile__("\n":::"memory");
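
A small sketch of that memory-mapped register case (the register name and address here are made up for illustration):

#include <stdint.h>

// Hypothetical status register at a made-up address. The volatile in the
// cast tells the compiler that every access must really hit that location.
#define UART_STATUS  (*(volatile uint32_t *)0x40001000u)

void wait_for_tx_done(void)
{
    // Without volatile the compiler could read the register once and spin
    // forever on the stale copy; with volatile it re-reads on every pass.
    while ((UART_STATUS & 0x01u) == 0u) {
        // busy-wait
    }
}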

HTH,

- Mark

Reply by matthewbarr, March 20, 2021

Great answer, Mark. A variable that can be written both by non-interrupt code and by an interrupt service routine is another common situation where a volatile declaration is appropriate.

The volatile declaration tells the compiler that it is not safe to assume that a value held in, say, a processor register from a previous access to a variable (memory address) still represents the current value of that variable. Hardware (caching, MMU operation) is unaware of volatile declarations.

Reply by mrfirmware, March 20, 2021

Matt, to your point about the ISR/background volatile: I always declare variables shared between ISR and background code volatile, but some colleagues have said that's not needed. I can imagine that if the ISR is in another translation unit, the variable would be extern, the compiler could not make assumptions about the background use of said variable, and so maybe volatile would not be required in that case? See below.

// isr.c
bool g_data_ready;
void isr(void) { if (!g_data_ready) { g_data_ready = 1; /* fill data */ } }

// background.c
int main(void)
{
    for (;;) {
        while (!g_data_ready) { // <-- Can compiler really optimize away?
            // process data
            g_data_ready = 0;
        }
    }
    return -1;
}

What if the flag variable and isr() were moved into background.c and we declared the flag variable static?

Reply by matthewbarr, March 20, 2021

Indeed, it depends entirely on what the compiler knows about the variable and how it handles it. As the comment indicates in your code example, it would be unsafe to optimize away the data ready test in background.c.

A volatile declaration may not be the only way to get there, but like you, I use them in these situations. I would rather have a superfluous volatile declaration than a missing volatile declaration with accompanying bug!

I'm assuming you'd separately compile the two source files and link. The code example doesn't have a declaration for g_data_ready in background.c so it's hard to say what a compiler might do and I think it would complain about the missing declaration. (I think you mean "while (g_data_ready) ..." in background.c, or am I missing something?)

Reply by mrfirmware, March 20, 2021

Yes, sorry about that. I agree about just using volatile and being safe. My second case should have looked like this:

#include <stdbool.h>

static bool s_data_ready;
static void isr(void) { if (!s_data_ready) { s_data_ready = 1; /* fill data */ } }

int main(void)
{
    for (;;) {
        // <-- I think the compiler can optimize this test away b/c no call to isr() is detected
        while (!s_data_ready) {
            // wait for the ISR to signal new data
        }
        // process data
        s_data_ready = 0;
    }
    return -1;
}
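
For comparison, the safe version just adds volatile to the shared flag (everything else unchanged):

static volatile bool s_data_ready;  // forces a real load/store on every access

That keeps the compiler from concluding the flag can never change just because it sees no caller of isr().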

Reply by jmford94, March 20, 2021

I see two separate questions here. Variables declared volatile are not cached in CPU registers between operations, so if something outside the current thread of execution writes to one, the current thread will read it from memory again each time it is used. Otherwise, the compiler will likely keep the variable in a register and access it without referring to memory.
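
A tiny sketch of that difference (the names here are just illustrative):

extern int plain_flag;             // ordinary variable
extern volatile int hw_flag;       // volatile variable

void wait_plain(void)
{
    // The compiler may load 'plain_flag' once into a register and spin on
    // that copy forever, since it sees no writer inside this function.
    while (plain_flag == 0) { }
}

void wait_volatile(void)
{
    // Every test of 'hw_flag' must be a fresh load from memory.
    while (hw_flag == 0) { }
}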

The other question, about MMUs, relates more to hardware, where cache-coherency behavior ensures (hopefully!) that any access to memory in the system fetches the new values and invalidates any stale copies in the hardware caches (mainly CPU caches these days). This is not affected by the compiler at all.


Reply by rmilne, March 20, 2021

I'm not in the MMU world (though I guess embedded forums have changed over the years and MMUs are now a topic), but the volatile attribute is quite common in my space (multi-core automotive). Spin locks via intrinsic test-and-set instructions must be performed on volatiles, and these variables have to go into non-cached memory regions of the map.
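
A rough sketch of that pattern (assuming GCC-style __sync builtins and a linker section mapped to uncached memory; the real intrinsics and section names are toolchain- and SoC-specific):

// Flag lives in an assumed ".noncached" section that the linker script
// places in an uncached region of the memory map.
static volatile unsigned char lock_flag __attribute__((section(".noncached")));

static void spin_lock(void)
{
    // Atomically set the flag to 1; keep spinning while it was already 1.
    while (__sync_lock_test_and_set(&lock_flag, 1)) {
        // busy-wait
    }
}

static void spin_unlock(void)
{
    __sync_lock_release(&lock_flag);
}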

Reply by sadash, March 20, 2021

When I started exploring caching of variables without volatile, I worked out a sequence for reading an input register:

int regRead = P1IN; // reading an input port on an MSP430 processor


1. The CPU first looks in cache memory, for efficiency.

2. If that misses (a cache miss), it looks in RAM.

3. In RAM, it finds the memory-mapped I/O port register.

4. It caches the data from the I/O port register address, moves it into CPU registers, and stores it in the "regRead" variable.

5. For repeated reads of the I/O port register, the CPU then reuses the register value that was read before.

6. The "volatile" qualifier comes into scope for this purpose: it makes the compiler re-read the register value.

7. But the point is, whether that re-read comes from cache or from RAM is unknown in this context.

8. So, for a specific situation like this, two approaches have been discussed so far:

- We have to make sure the register is in uncached address space (see the sketch below).

That means every time you access the register you are guaranteed to read/write the actual hardware register and not cache memory.

- A more complex but potentially better-performing approach is to use cached address space and have the code manually force cache updates.

(from the Stack Overflow question "volatile variable and cache memory")

For both approaches, how this is accomplished is architecture-dependent.

- Also, before any of this, the MMU will have been configured so that this register's address region is not cached.
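
To make the first approach concrete, this is roughly what the MSP430 case looks like (a sketch; the exact register address and header details depend on the device):

#include <msp430.h>

// In the device header, P1IN is already declared as a volatile
// memory-mapped register, something like:
//   #define P1IN  (*(volatile unsigned char *)0x0020)
// so every read of P1IN compiles to an actual load from the port address
// instead of reusing a value the compiler has kept in a CPU register.

int main(void)
{
    for (;;) {
        unsigned char regRead = P1IN;   // re-reads the input port each iteration
        // ... act on regRead ...
    }
}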

Is my sequence correct?