Adding volatile to g_counter tells the compiler not to optimize away
accesses to the variable, because its value can change "at any time".
That may sound like just the thing for multi-threaded access to the same
variable -- but it's not sufficient.
Even though the compiler is no longer going to optimize away access to
the variable, the hardware can still effectively isolate the two threads
from one another, such that neither thread is aware that the g_counter
variable has changed.
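Why? Because each thread could run on a different core, with the variable
held in that core's own cache or store buffer, and volatile does nothing
to make an update atomic or to order it with respect to the other core.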
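
One common remedy, shown here only as a minimal sketch, is to make the
counter atomic rather than volatile. The sketch assumes C++11 or later and
assumes that g_counter is a plain integer incremented by several threads;
the helper name count_with_atomic and its parameters are hypothetical and
exist only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Shared counter (assumed to be an int). std::atomic makes each increment
// a single indivisible read-modify-write that other threads will observe.
std::atomic<int> g_counter{0};

// Hypothetical helper: starts num_threads threads that each increment
// g_counter increments_per_thread times, then returns the final value.
int count_with_atomic(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    g_counter.store(0);

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                g_counter.fetch_add(1);  // atomic increment, no lost updates
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // wait for every worker before reading the result
    }
    return g_counter.load();
}
```

With the atomic counter, count_with_atomic(2, 100000) returns 200000 on
every run. If g_counter were a volatile int instead, the two threads could
overwrite each other's increments and the total could come up short.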