Re: [RFC PATCH 2/2] RDMA/rxe: Add RDMA Atomic Write operation

Jason Gunthorpe <jgg@xxxxxxxx> · Fri, 7 Jan 2022 15:28:22 -0400

On Fri, Jan 07, 2022 at 10:38:30AM -0500, Tom Talpey wrote:
> 
> On 1/7/2022 7:22 AM, Jason Gunthorpe wrote:
> > On Fri, Jan 07, 2022 at 02:15:25AM +0000, yangx.jy@xxxxxxxxxxx wrote:
> > > On 2022/1/6 21:00, Jason Gunthorpe wrote:
> > > > On Thu, Jan 06, 2022 at 10:52:47AM +0000, yangx.jy@xxxxxxxxxxx wrote:
> > > > > On 2022/1/6 7:53, Jason Gunthorpe wrote:
> > > > > > On Thu, Dec 30, 2021 at 04:39:01PM -0500, Tom Talpey wrote:
> > > > > > 
> > > > > > > Because RXE is a software provider, I believe the most natural approach
> > > > > > > here is to use an atomic64_set(dst, *src).
> > > > > > A smp_store_release() is most likely sufficient.
> > > > > Hi Jason, Tom
> > > > > 
> > > > > Is smp_store_mb() better here? It calls WRITE_ONCE + smp_mb/barrier().
> > > > > I think the semantics of 'atomic write' is to do atomic write and make
> > > > > the 8-byte data reach the memory.
> > > > No, it is not 'data reach memory' it is a 'release' in that if the CPU
> > > > later does an 'acquire' on the written data it is guaranteed to see
> > > > all the preceding writes.
> > > Hi Jason, Tom
> > > 
> > > Sorry for the wrong statement. I mean that the semantics of 'atomic
> > > write' is to write an 8-byte value atomically and make the 8-byte value
> > > visible for all CPUs.
> > > 'smp_store_release' makes all the preceding writes visible for all CPUs
> > > before doing an atomic write. I think this guarantee should be done by
> > > the preceding 'flush'.
> 
> An ATOMIC_WRITE is not required to provide visibility for prior writes,
> but it *must* be ordered after those writes. 

It doesn't make much sense to really talk about "visibility"; it is
very rare that something would need to fully stop until other
things can see it.

What we generally talk about these days is only order.

This is what release/acquire is about. smp_store_release() says that
someone doing smp_load_acquire() on the same data is guaranteed to
observe the previous writes if it observes the data that was written.

E.g. if you release a head pointer in a queue then acquiring the new
head pointer value also guarantees that all data in the queue is
visible to you.
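
As a rough userspace sketch of that queue example (not rxe code), the
pattern looks something like the following. It assumes C11 atomics as
stand-ins for smp_store_release()/smp_load_acquire(); struct ring,
ring_publish(), ring_peek() and RING_SLOTS are hypothetical names made
up for illustration, and wrap-around/overwrite handling is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8    /* hypothetical size, chosen for illustration */
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
              "RING_SLOTS must be a power of two");

struct ring {
        int slots[RING_SLOTS];  /* plain, non-atomic payload */
        atomic_size_t head;     /* number of entries published so far */
};

/* Producer: fill the slot first, then release the new head value. */
static void ring_publish(struct ring *r, int value)
{
        size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

        r->slots[head % RING_SLOTS] = value;
        /* Userspace stand-in for smp_store_release(&r->head, head + 1) */
        atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Consumer: acquire the head; every entry below it is then safe to read. */
static bool ring_peek(struct ring *r, size_t idx, int *out)
{
        /* Userspace stand-in for smp_load_acquire(&r->head) */
        size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

        if (idx >= head)
                return false;   /* new head not observed yet */
        *out = r->slots[idx % RING_SLOTS];
        return true;
}
```

A producer thread would call ring_publish() and a consumer thread
ring_peek(). If the consumer observes the new head, it also observes
the slot contents written before it. Nothing here tells the consumer
when that will happen.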

However, release doesn't say anything about *when* other observers may
have this visibility, and it certainly doesn't stop and wait until all
observers are guaranteed to see the new data.

> ATOMIC_WRITE, then there's nothing to do. But in other workloads, it is
> still mandatory to provide the ordering. It's probably easiest, and no
> less expensive, to just wmb() before processing the ATOMIC_WRITE.

Which is what smp_store_release() does:

#define __smp_store_release(p, v)                                       \
do {                                                                    \
        __smp_mb();                                                     \
        WRITE_ONCE(*p, v);                                              \
} while (0)

Notice this is the opposite of what smp_store_mb() does:

#define __smp_store_mb(var, value)  \
do { \
        WRITE_ONCE(var, value); \
        __smp_mb(); \
} while (0)

Which is *not* a release and does *not* guarantee order properties. It
is very similar to what FLUSH would provide in IBA, and very few
things benefit from this. (Indeed, I suspect many of the users in the
kernel are wrong, looking at you SIW..)
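
To make the difference concrete, here is a minimal userspace sketch of
the two shapes applied to a "write payload, then set a flag" pattern.
It assumes C11 atomics, with atomic_thread_fence(memory_order_seq_cst)
as a rough stand-in for __smp_mb(); struct msg and both publish_*()
helpers are hypothetical names, not anything from the kernel or rxe.

```c
#include <stdatomic.h>

struct msg {
        int payload;            /* plain data written before the flag */
        atomic_int ready;       /* flag a reader would load-acquire */
};

/* Barrier, then store: the shape of the __smp_store_release() fallback. */
static void publish_release_like(struct msg *m, int v)
{
        m->payload = v;
        /* The fence orders the payload write before the flag store. */
        atomic_thread_fence(memory_order_seq_cst);
        atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Store, then barrier: the shape of __smp_store_mb(). */
static void publish_store_mb_like(struct msg *m, int v)
{
        m->payload = v;
        /*
         * Nothing sits between the payload write and the flag store, so
         * a reader may observe ready == 1 and still see a stale payload.
         */
        atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
        /* This fence only orders the flag store against later accesses. */
        atomic_thread_fence(memory_order_seq_cst);
}
```

Run on a single thread both helpers leave the same values behind; the
difference only shows up in what a concurrent reader that acquires
'ready' is allowed to conclude about 'payload'.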

> Xiao Yang - where do you see the spec requiring that the ATOMIC_WRITE
> 64-bit payload be made globally visible as part of its execution? 

I don't see this either. I don't think IBA contemplates something
analogous to 'sequentially consistent ordering'.

Jason