Re: [PATCH 17/19] LoongArch: Add multi-processor (SMP) support

On Tue, Jul 06, 2021 at 12:18:18PM +0800, Huacai Chen wrote:
> +#define __smp_load_acquire(p)							\
> +({										\
> +	union { typeof(*p) __val; char __c[1]; } __u;				\
> +	unsigned long __tmp = 0;							\
> +	compiletime_assert_atomic_type(*p);					\
> +	switch (sizeof(*p)) {							\
> +	case 1:									\
> +		*(__u8 *)__u.__c = *(volatile __u8 *)p;				\
> +		__smp_mb();							\
> +		break;								\
> +	case 2:									\
> +		*(__u16 *)__u.__c = *(volatile __u16 *)p;			\
> +		__smp_mb();							\
> +		break;								\
> +	case 4:									\
> +		__asm__ __volatile__(						\
> +		"amor.w %[val], %[tmp], %[mem]	\n"				\
> +		: [val] "=&r" (*(__u32 *)__u.__c)				\
> +		: [mem] "ZB" (*(u32 *) p), [tmp] "r" (__tmp)			\
> +		: "memory");							\
> +		break;								\
> +	case 8:									\
> +		__asm__ __volatile__(						\
> +		"amor.d %[val], %[tmp], %[mem]	\n"				\
> +		: [val] "=&r" (*(__u64 *)__u.__c)				\
> +		: [mem] "ZB" (*(u64 *) p), [tmp] "r" (__tmp)			\
> +		: "memory");							\
> +		break;								\
> +	default:								\
> +		barrier();							\
> +		__builtin_memcpy((void *)__u.__c, (const void *)p, sizeof(*p));	\
> +		__smp_mb();							\

smp_load_acquire() is explicitly not defined for anything larger than
the machine word size, so this default case is not needed.

> +	}									\
> +	__u.__val;								\
> +})

By using that cute AMO-fetch-or, this LOAD turns into a LOAD-STORE
cycle. That means you cannot use it on read-only memory, and it hurts
the cache as well, since the atomic needs the line for writing.

Surely just the volatile load followed by smp_mb() is faster and saner.
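
To illustrate, here is a rough userspace sketch of the load-then-barrier
form for the 4-byte case. It is not the kernel implementation: the names
sketch_smp_mb() and sketch_load_acquire_u32() are made up for this
example, a C11 seq_cst fence stands in for __smp_mb(), and sizeof(long)
is assumed to be the machine word size.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __smp_mb(): a full fence. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Hypothetical 4-byte load-acquire: a plain volatile load followed by
 * a full barrier. The load never writes to *p, so it also works on
 * read-only memory, unlike an AMO-based fetch-or.
 */
static inline uint32_t sketch_load_acquire_u32(const volatile uint32_t *p)
{
	/* Assumption: sizeof(long) is the machine word size. */
	_Static_assert(sizeof(uint32_t) <= sizeof(long),
		       "load-acquire is only defined up to machine word size");

	uint32_t val = *p;	/* pure load, no store to *p */

	sketch_smp_mb();	/* order the load before all later accesses */
	return val;
}
```

Because the function only reads through its pointer, it can be handed a
const object, which the AMO variant could not touch without a store.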


