Re: [PATCH for-next v2 2/4] IB/hfi1: Move rvt_cq_wc struct into uapi directory

Jason Gunthorpe <jgg@xxxxxxxx> · Tue, 5 Feb 2019 12:41:18 -0700

On Tue, Feb 05, 2019 at 07:22:50PM +0000, Arumugam, Kamenee wrote:
> >> Here is the usage for RDMA_WRITE_UAPI_ATOMIC macro. For cq enter, we 
> >> kept smp_wmb() in the code as it is now and we didn't adopt it into 
> > >this macro. This macro Is meant for compile barrier.
> 
> > That should be using smp_store_release()/load_acquire(), not open 
> > coding barriers.
> > 
> We are concerned that the barrier() in __smp_store_release() isn't
> strong enough on all platforms.

The platform must provide barriers inside acquire/release that are
strong enough to meet the behavioral definition from the memory
model. 

Ie the acquire is guaranteed to see all stores executed by another CPU
prior to release.

I'm not sure what you mean by 'strong enough'

> > When working with unlocked data like this it is important to be 
> > careful with these things. READ/WRITE_ONCE are only correct if the 
> > single value is required, if you are doing anything that has a 
> > dependency on other values then you have to use acquire/release 
> > semantics to ensure global ordering is achieved.
>
> The smp_store_release() has a compile time assert that prevents the tearing.
>
> I assume that is why you are suggesting the smp_store_release().

No, not quite, there is a difference between reading the single value
and being able to observe any prior stores to other values.

> > These mirror to the userspace stdatomic definitions, so userspace must 
> > also use the proper pairs - ie a smp_loac_acquire() in kernel must be 
> > paired with an atomic_store_explicit(memory_order_release) - and vice 
> > versa
> > 
> > Similar for the READ/WRITE_ONCE but it should use 
> > memory_order_relaxed.
> > 
> 
> So are you now suggesting to use the smp_load_acquire() instead of READ_ONCE()?

It depends entirely on what your algorithm is doing. If you can use
relaxed ordering then READ/WITE_ONCE are OK. If you need to have
acquire/release semantics then you should use acquire/release.

An open coded barrier here is rarely a good idea.

Jason