Re: Re: Re: [PATCH] rdma/siw: avoid smp_store_mb() on a u64

Jason Gunthorpe <jgg@xxxxxxxx> · Fri, 12 Jul 2019 11:42:57 -0300

On Fri, Jul 12, 2019 at 02:35:50PM +0000, Bernard Metzler wrote:

> >This looks wrong to me.. a userspace notification re-arm cannot be
> >lost, so having a split READ/TEST/WRITE sequence can't possibly work?
> >
> >I'd expect an atomic test and clear here?
> 
> We cannot avoid the case that the application re-arms the
> CQ only after a CQE got placed. That is why folks are polling the
> CQ once after re-arming it - to make sure they do not miss the
> very last and single CQE which would have produced a CQ event.

That is different: that is a re-arm happening after a CQE placement,
and that can't be fixed.

What I said is that a re-arm from userspace cannot be lost. So you
can't blindly clear the arm flag with the WRITE_ONCE. It might be OK
because of the if, but...

It is just goofy to write it without a 'test and clear' atomic. If the
writer side consumes the notify it should always be done atomically.

And then I think all the weird barriers go away
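
To make the 'test and clear' point concrete, here is a rough userspace
sketch using C11 atomics. It is not the siw code: struct cq_notify, the
NOTIFY_* values and both function names are made up for illustration,
and the C11 default ordering (seq_cst) is only a stand-in for whatever
the kernel helpers would provide. The point is only that the producer
decides and clears in one atomic step, so a concurrent re-arm is either
seen and consumed or left in place, never overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, loosely modeled on the SIW_NOTIFY_* names. */
#define NOTIFY_NOT        0u
#define NOTIFY_SOLICITED  (1u << 0)
#define NOTIFY_NEXT_COMP  (1u << 1)

/* Hypothetical stand-in for the notify word shared with userspace. */
struct cq_notify {
	_Atomic uint32_t flags;
};

/* Consumer side: (re-)arm the CQ with the requested notification flags. */
static void cq_arm(struct cq_notify *n, uint32_t flags)
{
	assert(n != NULL);
	atomic_store(&n->flags, flags);
}

/*
 * Producer side: returns true if this completion should raise an event,
 * and in that case the arm state has been consumed atomically. If the
 * flags change between the load and the compare-exchange (for example a
 * concurrent re-arm), the loop re-evaluates with the new value instead
 * of blindly overwriting it.
 */
static bool cq_consume_arm(struct cq_notify *n, bool solicited)
{
	uint32_t old;

	assert(n != NULL);
	old = atomic_load(&n->flags);
	do {
		bool wanted = (old & NOTIFY_NEXT_COMP) ||
			      ((old & NOTIFY_SOLICITED) && solicited);
		if (!wanted)
			return false; /* not armed for this CQE, leave flags alone */
		/* on failure, 'old' is refreshed with the current value */
	} while (!atomic_compare_exchange_weak(&n->flags, &old, NOTIFY_NOT));

	return true;
}
```

With this shape there is no separate READ, TEST and WRITE for a re-arm
to slip in between, which is why I'd expect the extra barriers to
become unnecessary.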

> >> @@ -1141,11 +1145,17 @@ int siw_req_notify_cq(struct ib_cq *base_cq, enum ib_cq_notify_flags flags)
> >>  	siw_dbg_cq(cq, "flags: 0x%02x\n", flags);
> >>  
> >>  	if ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED)
> >> -		/* CQ event for next solicited completion */
> >> -		smp_store_mb(*cq->notify, SIW_NOTIFY_SOLICITED);
> >> +		/*
> >> +		 * Enable CQ event for next solicited completion.
> >> +		 * and make it visible to all associated producers.
> >> +		 */
> >> +		smp_store_mb(cq->notify->flags, SIW_NOTIFY_SOLICITED);
> >
> >But what is the 2nd piece of data to motivate the smp_store_mb?
> 
> Another core (such as a concurrent RX operation) shall see this
> CQ being re-armed asap.

'ASAP' is not a '2nd piece of data'. 

AFAICT this requirement is just a normal atomic set_bit which does
also expedite making the change visible?
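
As a sketch of what I mean by a plain atomic bit operation, again in
userspace C11 and not kernel code: arm_set_bit() and
arm_test_and_clear_bit() are hypothetical helpers that only mimic the
shape of set_bit()/test_and_clear_bit(). The kernel helpers have their
own ordering rules, which this model does not try to reproduce.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit number for "notify on next solicited completion". */
#define ARM_SOLICITED_BIT 0u

/* Re-arm: atomically set one bit, leaving the other bits untouched. */
static void arm_set_bit(atomic_uint *word, unsigned int bit)
{
	assert(word != NULL && bit < 32);
	atomic_fetch_or(word, 1u << bit);
}

/*
 * Producer: atomically clear the bit and report whether it was set.
 * Exactly one caller can observe 'true' for each time the bit is set.
 */
static bool arm_test_and_clear_bit(atomic_uint *word, unsigned int bit)
{
	unsigned int mask;

	assert(word != NULL && bit < 32);
	mask = 1u << bit;
	return (atomic_fetch_and(word, ~mask) & mask) != 0;
}
```

There is only one piece of data here, the flag word itself, so nothing
in this pattern calls for a store followed by a full barrier.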

Jason