On Fri, Nov 08, 2013 at 10:29:35AM +0000, Mathieu Desnoyers wrote:
> > > +#define smp_store_release(p, v)					\
> > > +do {								\
> > > +	compiletime_assert_atomic_type(*p);			\
> > > +	smp_mb();						\
> > > +	ACCESS_ONCE(*p) = (v);					\
> > > +} while (0)
> > > +
> > > +#define smp_load_acquire(p)					\
> > > +({								\
> > > +	typeof(*p) ___p1 = ACCESS_ONCE(*p);			\
> > > +	compiletime_assert_atomic_type(*p);			\
> > > +	smp_mb();						\
> > > +	___p1;							\
> > > +})
> > 
> > Can you move those "generic" definitions into asm-generic/barrier.h
> > under an ifdef guard ?
> > 
> > The pattern using "smp_mb()" seems to be the right target for a generic
> > implementation.
> > 
> > We should probably document the requirements on sizeof(*p) and
> > alignof(*p) directly above the macro definition.

So that is the one in asm-generic; it's just that some archs have a
rather complex barrier.h and I didn't want to include the asm-generic
header and make it worse still. Better to then keep everything in a
single file and suffer a little duplication.

> > > +#define smp_store_release(p, v)					\
> > > +do {								\
> > > +	compiletime_assert_atomic_type(*p);			\
> > > +	__lwsync();						\
> > 
> > Even though this is correct, it appears to bear more overhead than
> > necessary. See arch/powerpc/include/asm/synch.h
> > 
> > PPC_ACQUIRE_BARRIER and PPC_RELEASE_BARRIER
> > 
> > You'll notice that some variants of powerpc require something more
> > heavy-weight than a lwsync instruction. The fallback will be "isync"
> > rather than "sync" if you use PPC_ACQUIRE_BARRIER and
> > PPC_RELEASE_BARRIER rather than LWSYNC directly.

I think Paul answered this.
> > > +#else /* regular x86 TSO memory ordering */
> > > +
> > > +#define smp_store_release(p, v)					\
> > > +do {								\
> > > +	compiletime_assert_atomic_type(*p);			\
> > > +	barrier();						\
> > > +	ACCESS_ONCE(*p) = (v);					\
> > > +} while (0)
> > > +
> > > +#define smp_load_acquire(p)					\
> > > +({								\
> > > +	typeof(*p) ___p1 = ACCESS_ONCE(*p);			\
> > > +	compiletime_assert_atomic_type(*p);			\
> > > +	barrier();						\
> > > +	___p1;							\
> > > +})
> > 
> > Hrm, I really don't get the two barrier() above.
> > 
> > Or maybe we just need to document really well what's the semantic of a
> > store_release and load_acquire.

Hence the first patch; they provide the full ACQUIRE and RELEASE
semantics.

> > Furthermore, I don't see how a simple compiler barrier() can provide the
> > acquire semantic within smp_load_acquire on x86 TSO. AFAIU, a smp_rmb()
> > might be needed.

I think the other Paul answered this one on how the TSO memory model
provides all the required semantics.

> > > +/* Is this type a native word size -- useful for atomic operations */
> > > +#ifndef __native_word
> > > +# define __native_word(t)	(sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
> > > +#endif
> > 
> > Should we also check the pointer alignment, or that would be going too
> > far ?

I did consider that, but pointer alignment tests turn into code so I
left those out.