Re: [PATCH 3/4] arch: Introduce smp_load_acquire(), smp_store_release()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



----- Original Message -----
> From: "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: linux-arch@xxxxxxxxxxxxxxx, geert@xxxxxxxxxxxxxx, paulmck@xxxxxxxxxxxxxxxxxx, torvalds@xxxxxxxxxxxxxxxxxxxx,
> VICTORK@xxxxxxxxxx, oleg@xxxxxxxxxx, anton@xxxxxxxxx, benh@xxxxxxxxxxxxxxxxxxx, fweisbec@xxxxxxxxx,
> michael@xxxxxxxxxxxxxx, mikey@xxxxxxxxxxx, linux@xxxxxxxxxxxxxxxx, schwidefsky@xxxxxxxxxx, "heiko carstens"
> <heiko.carstens@xxxxxxxxxx>, "tony luck" <tony.luck@xxxxxxxxx>, "Will Deacon" <will.deacon@xxxxxxx>
> Sent: Friday, November 8, 2013 6:08:18 AM
> Subject: Re: [PATCH 3/4] arch: Introduce smp_load_acquire(), smp_store_release()
> 
> On Fri, Nov 08, 2013 at 10:29:35AM +0000, Mathieu Desnoyers wrote:
> > > > +#define smp_store_release(p, v)						\
> > > > +do {									\
> > > > +	compiletime_assert_atomic_type(*p);				\
> > > > +	smp_mb();							\
> > > > +	ACCESS_ONCE(*p) = (v);						\
> > > > +} while (0)
> > > > +
> > > > +#define smp_load_acquire(p)						\
> > > > +({									\
> > > > +	typeof(*p) ___p1 = ACCESS_ONCE(*p);				\
> > > > +	compiletime_assert_atomic_type(*p);				\
> > > > +	smp_mb();							\
> > > > +	___p1;								\
> > > > +})
> > > 
> > > Can you move those "generic" definitions into asm-generic/barrier.h
> > > under an ifdef guard ?
> > > 
> > > The pattern using "smp_mb()" seems to be the right target for a generic
> > > implementation.
> > > 
> > > We should probably document the requirements on sizeof(*p) and
> > > alignof(*p) directly above the macro definition.
> 
> So that is the one in asm-generic; its just that some archs have a
> rather complex barrier.h and I didn't want to include asm-generic header
> to make it still worse. Better to then keep everything in a single file
> and suffer a little duplication.

OK, I guess it makes sense initially.

> 
> > > > +#define smp_store_release(p, v)						\
> > > > +do {									\
> > > > +	compiletime_assert_atomic_type(*p);				\
> > > > +	__lwsync();							\
> > > 
> > > Even though this is correct, it appears to bear more overhead than
> > > necessary. See arch/powerpc/include/asm/synch.h
> > > 
> > > PPC_ACQUIRE_BARRIER and PPC_RELEASE_BARRIER
> > > 
> > > You'll notice that some variants of powerpc require something more
> > > heavy-weight than a lwsync instruction. The fallback will be "isync"
> > > rather than "sync" if you use PPC_ACQUIRE_BARRIER and
> > > PPC_RELEASE_BARRIER rather than LWSYNC directly.
> 
> I think Paul answered this.

Yes,

> 
> > > > +#else /* regular x86 TSO memory ordering */
> > > > +
> > > > +#define smp_store_release(p, v)						\
> > > > +do {									\
> > > > +	compiletime_assert_atomic_type(*p);				\
> > > > +	barrier();							\
> > > > +	ACCESS_ONCE(*p) = (v);						\
> > > > +} while (0)
> > > > +
> > > > +#define smp_load_acquire(p)						\
> > > > +({									\
> > > > +	typeof(*p) ___p1 = ACCESS_ONCE(*p);				\
> > > > +	compiletime_assert_atomic_type(*p);				\
> > > > +	barrier();							\
> > > > +	___p1;								\
> > > > +})
> > > 
> > > Hrm, I really don't get the two barrier() above.
> > > 
> > > Or maybe we just need to document really well what's the semantic of a
> > > store_release and load_acquire.
> 
> Hence the first patch; they provide the full ACQUIRE and RELEASE
> semantics.

Yes, got it following Paul's explanation.

> 
> > > Furthermore, I don't see how a simple compiler barrier() can provide the
> > > acquire semantic within smp_load_acquire on x86 TSO. AFAIU, a smp_rmb()
> > > might be needed.
> 
> I think the other Paul answered this one on how the TSO memory model
> provides all the required semantics.

Yes, that's right.

> 
> > > > +/* Is this type a native word size -- useful for atomic operations */
> > > > +#ifndef __native_word
> > > > +# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) ==
> > > > sizeof(long))
> > > > +#endif
> > > 
> > > Should we also check the pointer alignment, or that would be going too
> > > far ?
> 
> I did consider that, but pointer alignment tests turn into code so I
> left those out.

This is what I suspected. I'd still recommend commenting the requirements on "p" above the implementation of smp_store_release() and smp_store_acquire(), so callers clearly know what to expect. Other than this small nit (which you may choose to disregard at will), you can add my:

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux