count_stat_eventual.c uses a single global variable (stopflag) to synchronize two separate threads (main and eventual) probably running on two different CPU cores. My understanding is that for this model, memory fence instruction (mfence) which typically are expensive can be avoided, only if we make sure (1) that compiler neither reorder the code around nor cache the variable, and (2) that the CPU core running code precedes in program order. We use keyword ``volatile'' to prevent compilers from reordering the code and caching the variable, and instruction barrier() to force CPU core running the code to obey program order. Signed-off-by: Junchang Wang <junchangwang@xxxxxxxxx> --- CodeSamples/count/count_stat_eventual.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/CodeSamples/count/count_stat_eventual.c b/CodeSamples/count/count_stat_eventual.c index 059ab8b..5ffc4aa 100644 --- a/CodeSamples/count/count_stat_eventual.c +++ b/CodeSamples/count/count_stat_eventual.c @@ -23,7 +23,7 @@ DEFINE_PER_THREAD(unsigned long, counter); unsigned long global_count; -int stopflag; +volatile int stopflag; void inc_count(void) { @@ -47,7 +47,7 @@ void *eventual(void *arg) ACCESS_ONCE(global_count) = sum; poll(NULL, 0, 1); if (stopflag) { - smp_mb(); + barrier(); stopflag++; } } @@ -67,9 +67,9 @@ void count_init(void) void count_cleanup(void) { stopflag = 1; + barrier(); while (stopflag < 3) poll(NULL, 0, 1); - smp_mb(); } #include "counttorture.h" -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe perfbook" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html