On Wed, 2019-01-09 at 11:51 -0500, bfields@xxxxxxxxxxxx wrote:
> On Tue, Jan 08, 2019 at 04:21:40PM +0000, Trond Myklebust wrote:
> > On Tue, 2019-01-08 at 10:01 -0500, bfields@xxxxxxxxxxxx wrote:
> > > On Mon, Jan 07, 2019 at 10:06:19PM +0000, Trond Myklebust wrote:
> > > > On Mon, 2019-01-07 at 16:32 -0500, bfields@xxxxxxxxxxxx wrote:
> > > > > So maybe we actually need
> > > > >
> > > > > static bool (struct svc_xprt *xprt)
> > > > > {
> > > > > +       mb();
> > > >
> > > > You would at best need a 'smp_rmb()'. There is nothing to gain from
> > > > adding a write barrier here,
> > >
> > > That's not my understanding.
> > >
> > > What we have is basically:
> > >
> > >         1                       2
> > >         ----                    ----
> > >         WRITE to A              WRITE to B
> > >
> > >         READ from A and B       READ from A and B
> > >
> > > and we want to guarantee that at least one of those two reads will
> > > see both of the writes.
> > >
> > > A read barrier only orders reads with respect to the barrier, it
> > > doesn't do anything about writes, so doesn't guarantee anything here.
> >
> > In this context 'WRITE to A' and/or 'WRITE to B' are presumably the
> > operations of setting the flag bits in xprt->xpt_flags, no?
>
> Right, or I guess sk_sock->flags, or an atomic operation on
> xpt_reserved or xpt_nr_rqsts.
>
> > That's not occurring here, it is occurring elsewhere.
>
> Right. And I hadn't tried to verify whether there were corresponding
> (possibly implicit) write barriers in those places, thanks for doing
> that work:
>
> > The test_and_set_bit(XPT_DATA, &xprt->xpt_flags) in svc_data_ready()
> > performs an explicit barrier, so we shouldn't really care.
>
> OK.
>
> > The other cases where we do set_bit(XPT_DATA) don't matter since the
> > socket has its own locking, and so the XPT_DATA is really just a test
> > for whether or not we need to enqueue the svc_xprt.
>
> I'm not following, apologies.
>
> In any case it's set only on initialization or in recvfrom, and in the
> recvfrom case I think the
>
>         smp_mb__before_atomic();
>         clear_bit(XPT_BUSY, &xprt->xpt_flags);
>
> in svc_xprt_received() provides the necessary write barrier.
>
> But there are some exceptions in the rdma code, in svc_rdma_wc_receive
> and svc_rdma_wc_done.
>
> > In the only place where XPT_DEFERRED is set, you have an implicit
> > write barrier (due to a spin_unlock) between the call to set_bit() and
> > the call to svc_xprt_enqueue(), so all data writes are guaranteed to be
> > complete before any attempt to enqueue the socket.
>
> OK.
>
> > I can't see that you really care for the case of XPT_CONN, since the
> > just-created socket isn't going to be visible to other cpus until
> > you've added it to &pool->sp_sockets (which also has implicit write
> > barriers due to spin locks).
> >
> > I don't think you really care for the case of XPT_CLOSE either since
> > svc_delete_xprt() doesn't depend on any other data writes that aren't
> > already protected by spinlocks.
>
> OK. Yes, I'm not worried about XPT_CONN or XPT_CLOSE.
>
> > So the conclusion would be to add smp_rmb() in
> > svc_xprt_has_something_to_do(). No extra write barriers are needed
> > AFAICS.
> > You may still need the READ_ONCE() in order to add a data dependency
> > barrier (i.e. to ensure that alpha processors don't reorder reads of
> > the xpt_flags with other speculative reads). That should reduce to a
> > standard read on all non-alpha architectures.
>
> That looks unnecessary; memory-barriers.txt say "Read memory barriers
> imply data dependency barriers", and later "As of v4.15 of the Linux
> kernel, an smp_read_barrier_depends() was added to READ_ONCE()".
>

The above is stating that

        smp_rmb();
        smp_read_barrier_depends();
        if (xprt->xpt_flags & ....)

is redundant and can be replaced with just

        smp_rmb();
        if (xprt->xpt_flags & ....)

However that's not the case for smp_rmb() followed by READ_ONCE(). That
would expand to

        smp_rmb();
        if (xprt->xpt_flags & ...) {
                smp_read_barrier_depends();
        } else
                smp_read_barrier_depends();

which is not redundant. It is ensuring (on alpha only) that the read of
xprt->xpt_flags is also not re-ordered w.r.t. other data reads that
follow.

See, for instance, kernel/events/core.c which has several examples, or
kernel/exit.c.

> I still wonder about:
>
>         - the RDMA cases above.
>         - svc_xprt_release_slot: no write barrier after writing to
>           xprt->xpt_nr_rqsts.
>         - svc_reserve: no barrier after writing to xpt_reserved
>
> Also svc_write_space is setting SOCK_NOSPACE and then calling
> svc_xprt_enqueue. I'm pretty sure the sk_write_space method has to
> have a write barrier after that, though, so this is OK.
>
> --b.
>
> > > --b.
> > >
> > > >
> > > > and you don't even need a read barrier in
> > > > the non-smp case.
> > > >
> > > > >         if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
> > > > >                 return true;
> > > > >         if (xprt->xpt_flags &
> > > > > ((1<<XPT_DATA)|(1<<XPT_DEFERRED))) {
> > > > >
> > > > > Then whichever memory barrier executes second guarantees that
> > > > > the following check sees the result of both the XPT_DATA and
> > > > > xpt_nr_rqsts changes. I think....

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
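
Addendum on the two-column WRITE/READ pattern quoted earlier in the thread
(side 1 writes A, side 2 writes B, then each side reads A and B). What
follows is a toy model only, not kernel code and not the kernel memory
model: it enumerates every interleaving of the four operations and counts
those in which neither reader sees both writes. All names (enum op,
count_missed(), keep_program_order) are hypothetical. The one assumption
being modelled is that a full barrier between each side's write and its
read keeps them in program order, while without one the read may be
performed ahead of the write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; nothing here comes from net/sunrpc. */
enum op { WRITE_A, READ_1, WRITE_B, READ_2, NR_OPS };

/* Index of 'what' within one interleaving, or -1 if it is absent. */
static int position(const enum op order[NR_OPS], enum op what)
{
    for (int i = 0; i < NR_OPS; i++)
        if (order[i] == what)
            return i;
    return -1;
}

/* Replay one interleaving; true if some reader observed both A and B set. */
static bool some_reader_saw_both(const enum op order[NR_OPS])
{
    bool a = false, b = false, saw_both = false;

    for (int i = 0; i < NR_OPS; i++) {
        switch (order[i]) {
        case WRITE_A:
            a = true;
            break;
        case WRITE_B:
            b = true;
            break;
        default: /* READ_1 or READ_2: each one reads both A and B */
            if (a && b)
                saw_both = true;
            break;
        }
    }
    return saw_both;
}

/*
 * Count the interleavings in which neither reader saw both writes.
 * keep_program_order == true models a full barrier between each side's
 * write and its read; false lets the read move ahead of the write.
 */
static int count_missed(bool keep_program_order)
{
    int perms = 0, missed = 0;

    /* Each base-4 digit of 'code' picks one op; keep only permutations. */
    for (int code = 0; code < 256; code++) {
        enum op order[NR_OPS];
        bool used[NR_OPS] = { false };
        bool is_perm = true;

        for (int i = 0, c = code; i < NR_OPS; i++, c /= NR_OPS) {
            order[i] = (enum op)(c % NR_OPS);
            if (used[order[i]])
                is_perm = false;
            used[order[i]] = true;
        }
        if (!is_perm)
            continue;
        perms++;

        /* With barriers, skip orders that break either side's program order. */
        if (keep_program_order &&
            (position(order, WRITE_A) > position(order, READ_1) ||
             position(order, WRITE_B) > position(order, READ_2)))
            continue;

        if (!some_reader_saw_both(order))
            missed++;
    }
    assert(perms == 24); /* 4! orderings of the four operations */
    return missed;
}
```

In this model count_missed(true) returns 0, because whichever read runs
last has both writes ahead of it. count_missed(false) is nonzero; the
order READ_1, READ_2, WRITE_A, WRITE_B is one such case. The model says
nothing about which barriers the real call sites already provide, which
is the open question in the thread.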