Volatile Variables in C and C++
References
- Volatiles Are Miscompiled, and What to Do about It, Eric Eide et al.
- Is A Global Implicitly Volatile In C, Stackoverflow.com.
- Compilers - What Every Programmer Should Know About Compiler Optimizations, Hadi Brais, Feb 2015.
- Compilers - What Every Programmer Should Know About Compiler Optimizations, Part 2, Hadi Brais, May 2015.
- A Guide to Undefined Behavior in C and C++, Part 3, John Regehr.
- When is a Volatile Object Accessed?, GCC manual.
- Memory Ordering at Compile Time, Jun 25 2012.
- The Joys of Compiler and Processor Reordering, Microsoft Blog, March 2008.
- Instruction scheduling.
- The Trouble With Volatile, May 2007, LWN.
Types
The C standard has this to say about the volatile keyword.
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects.
So what? Why does the compiler care? The reason is that the compiler is free to re-organise and change your code during optimization as long as the visible result is the same.
The Compiler Design Handbook by Y.N. Srikant et al. has this to say about compiler optimisations.
Ever since the advent of reduced instruction set computers ... instruction scheduling techniques have gained importance as they rearrange instructions to "cover" the delay or latency that is required between an instruction and its dependent successor. Without such reordering, pipelines would stall, resulting in wasted processor cycles...
... Instruction scheduling methods for basic blocks may result in a moderate improvement (less than 5 to 10%) in performance, in terms of execution time of a schedule, for simple pipelined RISC architectures. However, the performance improvement achieved for multiple instruction issue processors could be significant...
Instruction scheduling is typically performed after machine-independent optimizations ... on the target machine's assembly code...
It is important to note that volatile only stops the compiler optimising the accesses to volatile objects. It does not imply that the object is in non-cacheable memory, or that caches are invalidated before it is read, or anything like that!
It should also be noted that whilst a volatile access will not be moved w.r.t. other volatile accesses, non-volatile accesses can be reordered around them.
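To make the last point concrete, here is a minimal sketch of the classic "flag plus payload" pattern. The names payload, ready, publish and consume are mine, purely for illustration. Because payload is not volatile, the compiler is allowed to move the store to it past the volatile store to ready. Run in a single flow of execution the result is always the same, so this only shows where the freedom lies, not a failure you can observe by running it.

```c
static int payload;         /* ordinary object: its accesses may be moved   */
static volatile int ready;  /* volatile object: its accesses keep their     */
                            /* order relative to other volatile accesses    */

/* Hypothetical producer. */
void publish(int value)
{
    payload = value;  /* non-volatile store: the compiler may legally sink  */
                      /* this below the volatile store that follows         */
    ready = 1;        /* volatile store: must be performed as written       */
}

/* Hypothetical consumer. */
int consume(void)
{
    while (!ready) {  /* volatile load: re-read on every iteration          */
    }
    return payload;   /* non-volatile load: not ordered against the loop    */
}
```

Making payload volatile as well would stop the compiler moving the two accesses relative to each other, but it still says nothing about what the processor or the caches do.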
When/How The Compiler Optimizes
Using the super amazing Godbolt Compiler Explorer, compiling with GCC for ARM (AArch64) at optimisation level 3 (-O3), we can explore how volatile works on one of the simplest optimisations.
The Compiler Can Assume A Variable Stays Constant
In the following example the compiler can see that the variable a is not modified inside the while loop. It assumes a single flow of execution and so can see that the expression a == 1 will always evaluate to the same boolean value within the loop. Thus, to save computational time, it does not need to re-evaluate this expression and, worse, branch on each iteration of the loop. It can just do this once before the loop runs and then either execute an empty loop or return.
    int a = 1;

    void f(void)
    {
        while (1) {
            if (a == 1)
                break;
        }
    }
    f:
            adrp    x0, .LANCHOR0
            ldr     w0, [x0, #:lo12:.LANCHOR0]
            cmp     w0, 1
            beq     .L1
    .L3:
            b       .L3
    .L1:
            ret
    a:
            .word   1
Written back out as C, the assembler is equivalent to this:

    int a = 1;

    void f(void)
    {
        if (a == 1)
            goto L1;
        while (1) { }
    L1:
        return;
    }
You can see the above code on Godbolt.
Using Volatile To Remove Compiler's Ability To Assume Constantness
Once the global variable from the previous example is marked volatile, the compiler cannot assume anything about the state of the variable w.r.t. the last statement executed. Thus, it is not free to hoist the read of the variable out of the loop: on each evaluation it can no longer assume that the value is the same, so it has to reload it from memory every time.
    volatile int a = 1;

    void f(void)
    {
        while (1) {
            if (a == 1)
                break;
        }
    }
    f:
            adrp    x1, .LANCHOR0
            add     x1, x1, :lo12:.LANCHOR0
    .L2:
            ldr     w0, [x1]
            cmp     w0, 1
            bne     .L2
            ret
    a:
            .word   1
You can see the above code on Godbolt.
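As an illustration of a variable that really can change outside the "single flow of execution" the compiler assumes, here is a small sketch using a signal handler. The names got_signal, on_signal and wait_for_signal_demo are hypothetical. The demo raises the signal itself so that it terminates; in real code the signal would arrive asynchronously.

```c
#include <signal.h>

/* Written by the handler, read by the loop. volatile sig_atomic_t is the
 * type the C standard provides for flags shared with a signal handler.   */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: just records that the signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Returns 1 once the flag has been seen, or -1 if the handler could not
 * be installed.                                                          */
int wait_for_signal_demo(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;

    raise(SIGINT);            /* the handler runs before raise() returns  */

    while (!got_signal) {     /* volatile: re-read on every iteration     */
    }

    signal(SIGINT, SIG_DFL);  /* restore the default disposition          */
    return 1;
}
```

Without volatile the compiler would be entitled to read got_signal once and spin forever, exactly as in the first example.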
Using A Memory Barrier To Remove Compiler's Ability To Assume Constantness
This one was brought to my attention when I was reading up on the view that the Linux kernel community takes of the volatile keyword within the kernel and, in another context, on using volatile to share data between threads. The same effect can be produced, in the above example, by using a "memory barrier" that forces the compiler to assume that any values it is holding in registers are "dirty" and that objects must be reloaded from memory. Doing so means that the object must be reloaded on each loop iteration, so, again, the compiler cannot hoist it out of the loop. Note that the empty asm statement with a "memory" clobber is a compiler-only barrier: it emits no instruction and does not stop the processor from reordering accesses.
Why might we want to do this?
    int a = 1;

    void f(void)
    {
        while (1) {
            asm volatile("" : : : "memory");
            if (a == 1)
                break;
        }
    }
    f:
            adrp    x1, .LANCHOR0
            add     x1, x1, :lo12:.LANCHOR0
    .L2:
            ldr     w0, [x1]
            cmp     w0, 1
            bne     .L2
            ret
    a:
            .word   1
You can see the above code on Godbolt.
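In practice the barrier is usually hidden behind a macro (the Linux kernel calls its version barrier()). Below is a rough sketch of that idea, assuming GCC or Clang since the asm syntax is a compiler extension. The names COMPILER_BARRIER, shared_flag and spin_until_set are made up for this example, and the loop is bounded only so that it can be run safely.

```c
/* GCC/Clang extension: an empty asm with a "memory" clobber. It emits no
 * instruction; it only tells the compiler that memory may have changed.  */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

int shared_flag = 0;  /* deliberately NOT volatile */

/* Polls shared_flag up to max_spins times. Returns the iteration on which
 * the flag was seen set, or -1 if it never was.                          */
int spin_until_set(int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        COMPILER_BARRIER();    /* forces shared_flag to be reloaded       */
        if (shared_flag == 1)
            return i;
    }
    return -1;
}
```

The difference from volatile is one of scope: the barrier makes the compiler reload everything it has cached from memory at that one point, whereas volatile affects every access to one particular object.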
Using A Function Call To Remove Compiler's Ability To Assume Constantness
If a call to a function in a different translation unit (which the compiler can't see into unless cross-module optimisations are being done) is placed before the evaluation of the conditional, the compiler cannot assume that the state of a has not been modified as a side effect of the function call.
    int a = 1;

    extern void something(void);

    void f(void)
    {
        while (1) {
            something();
            if (a == 1)
                break;
        }
    }
    f:
            stp     x29, x30, [sp, -32]!
            mov     x29, sp
            str     x19, [sp, 16]
            adrp    x19, .LANCHOR0
            add     x19, x19, :lo12:.LANCHOR0
    .L2:
            bl      something
            ldr     w0, [x19]
            cmp     w0, 1
            bne     .L2
            ldr     x19, [sp, 16]
            ldp     x29, x30, [sp], 32
            ret
    a:
            .word   1
You can see the above code on Godbolt.
Presumably, if global (cross-compilation-unit) optimisation, such as link-time optimisation, is turned on, this wouldn't necessarily work and either the caller or the callee would also have to use a memory barrier.
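For contrast, here is a sketch of the case where the function call does not help: the callee is defined in the same translation unit, so an optimising compiler is free to inline it, see that it never writes to a, and hoist the load out of the loop again. The helper something_visible is hypothetical, and what a particular compiler actually emits is something to check on Godbolt rather than take from me.

```c
int a = 1;

/* Hypothetical helper whose body the compiler can see. */
static void something_visible(void)
{
    /* does not touch a */
}

void f(void)
{
    while (1) {
        something_visible();  /* visible and side-effect free, so it puts */
                              /* no obstacle in the way of hoisting a     */
        if (a == 1)
            break;
    }
}
```

The trick in the example above therefore relies on the callee being opaque to the compiler, not on the call itself.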