Hi,

My understanding of arch/x86/include/asm/barrier.h is that Linux prefers the {L,S,M}FENCE instructions: a locked ADD is used only on x86_32 platforms that don't support XMM2 (SSE2).

However, several sources say that a locked ADD is much faster than the FENCE instructions, even on modern Intel CPUs such as Haswell. For example, please see these three sources:

"11.5.1 Locked Instructions as Memory Barriers
Optimization
Use locked instructions to implement Store/Store and Store/Load barriers."
http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf

"lock addl %(rsp), 0 is a better solution for StoreLoad barrier":
http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

"...locked instruction are more efficient barriers...":
http://www.pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/

I also found that FreeBSD prefers the locked ADD.

So I'm curious why Linux prefers MFENCE. I may be missing something; I searched for an answer but didn't find one.

Thanks,
-- Dexuan
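
P.S. To make the comparison concrete, here is a minimal sketch of the two StoreLoad barrier variants I am asking about, written as GCC-style inline assembly. The names barrier_mfence(), barrier_locked_add() and store_then_load() are my own placeholders, not the kernel's macros. The non-x86 fallback is there only so the snippet compiles elsewhere, and the MFENCE variant assumes a CPU with SSE2.

```c
#include <assert.h>

/* Variant 1: full barrier via MFENCE (assumes SSE2 is available). */
static inline void barrier_mfence(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("mfence" ::: "memory");
#else
    __sync_synchronize();   /* fallback so the sketch builds off x86 */
#endif
}

/* Variant 2: full barrier via a locked ADD of 0 to the top of the stack.
 * Adding 0 leaves the stack contents unchanged; only the ordering
 * side effect of the LOCK prefix is wanted. */
static inline void barrier_locked_add(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#else
    __sync_synchronize();   /* fallback so the sketch builds off x86 */
#endif
}

/* Dekker-style pattern that needs a StoreLoad barrier: publish our own
 * flag, then read the other side's flag. The barrier keeps the load
 * from being satisfied before the store is globally visible. */
static inline int store_then_load(volatile int *my_flag,
                                  const volatile int *other_flag,
                                  void (*barrier)(void))
{
    assert(my_flag && other_flag && barrier);
    *my_flag = 1;        /* store */
    barrier();           /* StoreLoad barrier under test */
    return *other_flag;  /* load */
}
```

Either function can be passed as the barrier argument, e.g. store_then_load(&a, &b, barrier_locked_add), so the same calling code can be timed with both variants.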