On Wed, Jan 27, 2021 at 09:36:22PM +0100, Alexander A Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxx>
>
> On Octeon smp_mb() translates to SYNC, while wmb+rmb translates to
> SYNCW only. This brings around a 10% performance improvement on tight
> uncontended spinlock loops.
>
> Refer to commit 500c2e1fdbcc ("MIPS: Optimize spinlocks.") and the link
> below.
>
> On a 6-core Octeon machine:
> sysbench --test=mutex --num-threads=64 --memory-scope=local run
>
> w/o patch:  1.60s
> with patch: 1.51s
>
> Link: https://lore.kernel.org/lkml/5644D08D.4080206@xxxxxxxxxxxxxxxxxx/
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxx>
> ---
>  arch/mips/include/asm/barrier.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> index 49ff172..24c3f2c 100644
> --- a/arch/mips/include/asm/barrier.h
> +++ b/arch/mips/include/asm/barrier.h
> @@ -113,6 +113,15 @@ static inline void wmb(void)
>  		".set arch=octeon\n\t"			\
>  		"syncw\n\t"				\
>  		".set pop" : : : "memory")
> +
> +#define __smp_store_release(p, v)			\
> +do {							\
> +	compiletime_assert_atomic_type(*p);		\
> +	__smp_wmb();					\
> +	__smp_rmb();					\
> +	WRITE_ONCE(*p, v);				\
> +} while (0)

This is wrong in general, since smp_rmb() only provides ordering
between two loads, while smp_store_release() is a store. If this is
correct for all MIPS, it needs a giant comment explaining exactly how
that smp_rmb() makes sense here.
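
To make the concern concrete, here is a minimal litmus-test sketch in
the tools/memory-model format (illustrative only, not from the patch;
the variables x, y, r0, r1 are made up for the example):

C LB+rel+acq

{}

P0(int *x, int *y)
{
	int r0;

	/* The release must order this load before the store to y. */
	r0 = READ_ONCE(*x);
	smp_store_release(y, 1);
}

P1(int *x, int *y)
{
	int r1;

	r1 = smp_load_acquire(y);
	WRITE_ONCE(*x, 1);
}

exists (0:r0=1 /\ 1:r1=1)

With a correct smp_store_release() the "exists" outcome is forbidden.
The proposed wmb+rmb sequence only guarantees store->store and
load->load ordering, so nothing in the generic barrier contract stops
the store to y from becoming visible before the load of x completes,
making the outcome reachable. If Octeon happens never to reorder a
load against a later store, that is precisely the property the giant
comment would have to spell out.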