From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>

smp_mb() uses lock;addl for x86 in the Linux kernel.
Add information about the same.

Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Co-developed-by: Akira Yokosawa <akiyks@xxxxxxxxx>
Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
---
Changes in v2 (by akiyks):
 - Apply punctuation conventions of perfbook LaTeX source.
 - Break lines at sentence-ending punctuation marks.
 - Overall wordsmith.
 - Fix typo in Subject. (implementation)
 - Drop confusing "the"s.
 - Use "lock;addl" for consistency in the section.
 - Reword "instead of directly modifying SP", which surprised me a bit.
 - Reorder the final sentence to make it obvious that mb() is the one
   that uses mfence.
---
 memorder/memorder.tex | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 5c978fbef172..6b9c3268e589 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -6081,6 +6081,16 @@ A few older variants of the x86 CPU have a mode bit that enables
 out-of-order stores, and for these CPUs, \co{smp_wmb()} must also be defined
 to be \co{lock;addl}.
 
+A 2017 kernel commit by Michael S.~Tsirkin replaced \co{mfence} with
+\co{lock;addl} in \co{smp_mb()}, achieving a 60 percent performance
+boost~\cite{Tsirkin2017}.
+The change used a 4-byte negative offset from \co{SP} to avoid
+slowness due to false data dependencies, instead of directly
+accessing memory pointed to by \co{SP}.
+\co{clflush} users still need to use \co{mfence} for ordering.
+Therefore, they were converted to use \co{mb()}, which uses \co{mfence}
+as before, instead of \co{smp_mb()}.
+
 Although newer x86 implementations accommodate self-modifying code
 without any special instructions, to be fully compatible with
 past and potential future x86 implementations, a given CPU must
-- 
2.25.1
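
For reference, here is a minimal user-space sketch of the two barrier
flavors that the added paragraph contrasts: a locked add of zero at a
4-byte negative offset from the stack pointer, and a plain mfence.
This is not the kernel's source.
The sketch_*() names are hypothetical, and the code assumes GCC or Clang
inline assembly on x86-64, with a compiler-builtin fallback on other
targets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space stand-ins for the two barriers that the new
 * paragraph contrasts.  The sketch_*() names are illustrative only and
 * are not the kernel's definitions.
 */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* Full barrier as a locked add of zero, 4 bytes below the stack pointer. */
#define sketch_smp_mb() \
	__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
/* Full barrier as mfence, the flavor that clflush users still need. */
#define sketch_mb() \
	__asm__ __volatile__("mfence" ::: "memory")
#else
/* Assumption: on other targets, fall back to the compiler's full fence. */
#define sketch_smp_mb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_mb()     __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/*
 * Store to *flag, then load *other, with a full barrier in between so
 * that the store is ordered before the load.  Store-to-load is the
 * ordering that x86 may otherwise relax for ordinary memory.
 */
static inline int sketch_store_then_load(int *flag, const int *other)
{
	assert(flag != NULL && other != NULL);
	*flag = 1;
	sketch_smp_mb();
	return *other;
}
```

Adding zero leaves the word below the stack pointer unchanged, so the
locked instruction acts only as a barrier.
A single-threaded call such as sketch_store_then_load(&a, &b) merely
exercises the code path; observing the ordering itself would need a
two-thread store-buffering test.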