smp_mb() uses lock;add on x86 in the Linux kernel.  Add information
about this to the memory-ordering chapter.

Cc: paulmck@xxxxxxxxxx
Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
---
Not even build-tested; I just focused on the content and on keeping my
promise that I'd send this out (better than never sending it) ;-).
I would appreciate the perfbook maintainers taking this forward ;-).
Thanks!

 bib/hw.bib            | 8 ++++++++
 memorder/memorder.tex | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/bib/hw.bib b/bib/hw.bib
index b0885e74..b1dfd119 100644
--- a/bib/hw.bib
+++ b/bib/hw.bib
@@ -1159,3 +1159,11 @@
 Luis Stevens and Anoop Gupta and John Hennessy",
 note="\url{https://github.com/google/fuzzing/blob/master/docs/silifuzz.pdf}",
 }
+@unpublished{Tsirkin2017,
+ Author="Michael S. Tsirkin",
+ Title="locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE",
+ month="November",
+ day="10",
+ year="2017",
+ note="\url{https://lore.kernel.org/all/tip-450cbdd0125cfa5d7bbf9e2a6b6961cc48d29730@xxxxxxxxxxxxxx/}",
+}
diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 5c978fbe..b28ac4f0 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -6081,6 +6081,14 @@ A few older variants of the x86 CPU have a mode bit that enables
 out-of-order stores, and for these CPUs, \co{smp_wmb()} must also be
 defined to be \co{lock;addl}.
 
+A 2017 kernel commit by Michael S. Tsirkin replaced \co{mfence} with
+\co{lock add} in \co{smp_mb()}, yielding a 60 percent performance
+improvement~\cite{Tsirkin2017}.  The \co{lock add} operates at a 4-byte
+negative offset from the stack pointer, rather than on the stack
+pointer itself, in order to avoid false data dependencies.  Code using
+\co{clflush} still requires \co{mfence} for ordering, so it was
+converted to use \co{mb()}, which retains \co{mfence}, instead of \co{smp_mb()}.
+
 Although newer x86 implementations accommodate self-modifying code
 without any special instructions, to be fully compatible with past
 and potential future x86 implementations, a given CPU must
-- 
2.42.0.655.g421f12c284-goog
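
For readers who want to see the two encodings side by side, here is a
minimal user-space sketch of the mfence-based and lock;addl-based full
barriers discussed in the added paragraph.  It is only an illustration,
not the kernel's actual barrier.h (whose macros differ across kernel
versions and between 32-bit and 64-bit builds); it assumes an x86-64
target and a GCC- or Clang-compatible compiler.

	/* Illustrative sketch only, for an x86-64 target with GCC/Clang. */
	#include <stdio.h>

	/* Full barrier via mfence, as smp_mb() used before the cited commit. */
	static inline void barrier_mfence(void)
	{
		__asm__ __volatile__("mfence" ::: "memory");
	}

	/*
	 * Full barrier via a locked add of zero to the word four bytes below
	 * the stack pointer.  The negative offset keeps the dummy access away
	 * from the data at the top of the stack, avoiding the false data
	 * dependencies that the added paragraph describes.
	 */
	static inline void barrier_lock_add(void)
	{
		__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
	}

	int main(void)
	{
		barrier_mfence();
		barrier_lock_add();
		printf("Both barrier variants executed.\n");
		return 0;
	}

Compiling this with, say, "gcc -O2" and running it merely exercises both
instruction sequences; the 60 percent figure in the cited commit came
from kernel benchmarks, not from this toy program.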