> On Jan 11, 2019, at 7:56 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Fri, Jan 11, 2019 at 05:27:30PM -0500, Chuck Lever wrote:
>>
>>
>>> On Jan 11, 2019, at 5:10 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>
>>> On Fri, Jan 11, 2019 at 04:54:01PM -0500, Chuck Lever wrote:
>>>>> On Jan 11, 2019, at 4:52 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>>>>>> So, I think we need your patch plus something like this.
>>>>>>
>>>>>> Chuck, maybe you could help me with the "XXX: Chuck:" parts?
>>>>>
>>>>> I haven't been following. Why do you think those are necessary?
>>>
>>> I'm worried something like this could happen:
>>>
>>>     CPU 1                   CPU 2
>>>     -----                   -----
>>>
>>>     set XPT_DATA            dec xpt_nr_rqsts
>>>
>>>     svc_xprt_enqueue        svc_xprt_enqueue
>>>
>>> And both decide nothing should be done if neither sees the change that
>>> the other made.
>>>
>>> Maybe I'm still missing some reason that couldn't happen.
>>>
>>> Even if it can happen, it's an unlikely race that will likely be fixed
>>> when another event comes along a little later, which would explain why
>>> we've never seen any reports.
>>>
>>>>> We've had set_bit and atomic_{inc,dec} in this code for ages,
>>>>> and I've never noticed a problem.
>>>>>
>>>>> Rather than adding another CPU pipeline bubble in the RDMA code,
>>>>> though, could you simply move the set_bit() call site inside the
>>>>> critical sections?
>>>>
>>>> er, inside the preceding critical section. Just reverse the order
>>>> of the spin_unlock and the set_bit.
>>>
>>> That'd do it, thanks!
>>
>> I can try that here and see if it results in a performance regression.
>
> Thanks, I've got a version with a typo fixed at
>
>   git://linux-nfs.org/~bfields/linux.git nfsd-next

Applied all four patches here. I don't see any performance regressions,
but my server has only a single last-level CPU cache.

--
Chuck Lever
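
For context, here is a minimal sketch of the reordering Chuck suggests,
assuming the call site in question resembles the svcrdma receive
completion path. The field names used below (sc_rq_dto_lock, sc_rq_dto_q,
rc_list) are illustrative approximations of that path, not code taken
from the patches under discussion.

    /* Illustrative sketch only; identifiers approximate the svcrdma
     * receive completion path and are not quoted from the patches
     * under discussion.
     */

    /* Before: XPT_DATA is set after the queue lock has been dropped. */
    spin_lock(&rdma->sc_rq_dto_lock);
    list_add_tail(&ctxt->rc_list, &rdma->sc_rq_dto_q);
    spin_unlock(&rdma->sc_rq_dto_lock);
    set_bit(XPT_DATA, &rdma->sc_xprt.xpt_flags);
    svc_xprt_enqueue(&rdma->sc_xprt);

    /* After: reverse the order of the spin_unlock and the set_bit, so
     * the flag is raised inside the preceding critical section.
     */
    spin_lock(&rdma->sc_rq_dto_lock);
    list_add_tail(&ctxt->rc_list, &rdma->sc_rq_dto_q);
    set_bit(XPT_DATA, &rdma->sc_xprt.xpt_flags);
    spin_unlock(&rdma->sc_rq_dto_lock);
    svc_xprt_enqueue(&rdma->sc_xprt);

The intent, as described in the thread above, is to let the existing
lock ordering cover the set_bit() rather than adding another explicit
barrier (a CPU pipeline bubble) in the RDMA hot path.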