mb() typically uses mfence on modern x86, but a micro-benchmark shows that
it's 2 to 3 times slower than the lock; addl $0,(%%e/rsp) sequence that we
use on older CPUs.

So let's use the locked variant everywhere - it helps keep the code simple
as well.

While I was at it, I found some inconsistencies in the comments in
arch/x86/include/asm/barrier.h.

I hope I'm not splitting this up too much - the reason is that I wanted to
isolate the code changes (that people might want to test for performance)
from the comment changes approved by Linus, and from the (so far unreviewed)
comment change I came up with myself.

Lightly tested on my system.

Michael S. Tsirkin (3):
  x86: drop mfence in favor of lock+addl
  x86: drop a comment left over from X86_OOSTORE
  x86: tweak the comment about use of wmb for IO

 arch/x86/include/asm/barrier.h | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

-- 
MST
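
For anyone who wants to reproduce the comparison outside the kernel, here is a rough user-space sketch of the two barrier variants. It is not the benchmark behind the numbers above and not the kernel's mb() definition: the names mb_mfence, mb_locked and time_barrier are made up for illustration, the asm is only emitted on x86-64 (other targets fall back to the compiler's full barrier), and the timing goes through a function pointer, so it includes call overhead.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical stand-ins for the two mb() variants discussed above. */
#if defined(__x86_64__)
static void mb_mfence(void)
{
	/* Full barrier via mfence. */
	__asm__ __volatile__("mfence" ::: "memory");
}

static void mb_locked(void)
{
	/* Full barrier via a locked add of zero to the top of the stack. */
	__asm__ __volatile__("lock; addl $0,(%%rsp)" ::: "memory", "cc");
}
#else
/* Assumption: on non-x86-64 targets, just use the compiler's full barrier. */
static void mb_mfence(void) { __sync_synchronize(); }
static void mb_locked(void) { __sync_synchronize(); }
#endif

/*
 * Average cost of one call to fn in nanoseconds, over iters calls.
 * The indirect call adds overhead, so only compare results relative
 * to each other.
 */
static double time_barrier(void (*fn)(void), long iters)
{
	struct timespec t0, t1;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		fn();
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling time_barrier(mb_mfence, n) and time_barrier(mb_locked, n) from a small main() with a large n gives two per-call figures to compare; the ratio will depend on the CPU and on whatever else is running.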