Hi Joel,

On 2023/10/13 10:22, Joel Fernandes (Google) wrote:
> smp_mb() uses lock;add for x86 in the linux kernel. Add information
> about the same.
>
> Cc: paulmck@xxxxxxxxxx
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
> Not even build tested, just focused on the content and to keep my promise I'd
> send this out (better than never sending it) ;-). I appreciate maintainers of
> perfbook taking this forward ;-). Thanks!

I've just tested this... and it failed to build.

I think I'll post a v2 which will build, along with some wordsmithing
I can think of.

A few quick comments below.

>
>  bib/hw.bib            | 8 ++++++++

bib/memorymodel.bib looks like a more suitable destination.

>  memorder/memorder.tex | 8 ++++++++
>  2 files changed, 16 insertions(+)
>
> diff --git a/bib/hw.bib b/bib/hw.bib
> index b0885e74..b1dfd119 100644
> --- a/bib/hw.bib
> +++ b/bib/hw.bib
> @@ -1159,3 +1159,11 @@ Luis Stevens and Anoop Gupta and John Hennessy",
>  note="\url{https://github.com/google/fuzzing/blob/master/docs/silifuzz.pdf}",
>  }
>
> +@unpublished{Tsirkin2017,
> +	Author="Michael S. Tsirkin",
> +	Title="locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE",

The "_" in the title needs an escape.

> +	month="November",
> +	day="10",
> +	year="2017",
> +	note="\url{https://lore.kernel.org/all/tip-450cbdd0125cfa5d7bbf9e2a6b6961cc48d29730@xxxxxxxxxxxxxx/}",
> +}
> diff --git a/memorder/memorder.tex b/memorder/memorder.tex
> index 5c978fbe..b28ac4f0 100644
> --- a/memorder/memorder.tex
> +++ b/memorder/memorder.tex
> @@ -6081,6 +6081,14 @@ A few older variants of the x86 CPU have a mode bit that enables out-of-order
>  stores, and for these CPUs, \co{smp_wmb()} must also be defined to
>  be \co{lock;addl}.
>
> +A 2017 kernel commit by Michael S. Tsirkin replaced \co{mfence} with
> +\co{lock add} in \co{smp_mb()}, achieving a 60 percent performance
> +boost~\cite{Tsirkin2017}. The change used a 4-byte negative offset from

A line break is needed after "\cite{Tsirkin2017}." (perfbook's LaTeX
source convention requires a line break at the end of each sentence).

> +the \co{SP} to avoid slowness due to false data-dependencies,
> +instead of directly modifying the \co{SP}. \co{clflush} users still
> +need to use \co{mfence} for ordering, so they have been converted to use
> +\co{mb} instead of \co{smp_mb}, which uses an \co{mfence} as before.
> +
>  Although newer x86 implementations accommodate self-modifying code
>  without any special instructions, to be fully compatible with
>  past and potential future x86 implementations, a given CPU must

Anyway, please wait for my v2.

        Thanks, Akira