On Mon, May 02, 2022 at 06:38:03PM -0700, dai.ngo@xxxxxxxxxx wrote:
> 
> On 5/2/22 6:21 PM, J. Bruce Fields wrote:
> >On Mon, May 02, 2022 at 09:12:52PM -0400, J. Bruce Fields wrote:
> >>Looks good to me.
> >And the only new test failures are due to the new DELAYs on OPEN.
> >Somebody'll need to fix up pynfs.  (I'm not volunteering for now.)
> 
> I will fix it, since I broke it :-)

By the way, I have three more notes on courtesy server stuff that I
wanted to dump into email before I forget them:

1. I do still recommend fixing up those pynfs failures.  The ones I
see are in RENEW3, LKU10, CLOSE9, and CLOSE8, but there may be others.

2. In the lock case, nfsd4_lock() holds an st_mutex while calling
vfs_lock_file(), which may end up needing to wait for the laundromat.
As I said in review, I don't see a potential deadlock there, so I'm
fine with the code going in as is.  But, as a note for possible
cleanup, or in case this does turn into a problem later:
vfs_lock_file() could return to nfsd4_lock(), and nfsd4_lock() could
easily drop the st_mutex, wait, and retry.  I think the only tricky
part would be deciding on conventions for the caller to tell
vfs_lock_file() that it shouldn't wait in this case (non-nfsd callers
will still want to wait), and for vfs_lock_file() to indicate that the
caller needs to retry.  Probably something in fl_flags for the former,
and an agreed-on error return for the latter?  (Rough sketch appended
below.)

3. One other piece of future work would be optimizing the conflicting
lock case.  This is a very premature optimization at this point, but
I'm just leaving my notes here in case someone's interested:

The loop in posix_lock_inode() is currently O(N^2) in the number of
expirable clients holding conflicting locks, because each time we
encounter one, we wait and then restart.  In practice I doubt that
matters--if you have a lot of clients to expire, the time spent
rescanning the list will likely be trivial compared to the time spent
waiting for nfsdcld to commit the expiry of each client to stable
storage.

*However*, it might become a more significant optimization if we first
allowed more parallelism in nfsdcld.  And that might also benefit some
other cases (e.g., lots of clients reconnecting after a crash).  We'd
need a parallel nfsdcld--no idea what that would involve--and I think
it would also help to update the kernel<->nfsdcld protocol with a
separate commit operation, so that nfsd could issue a bunch of client
changes and then a single commit to wait for them all.

That done, we could modify the loop in vfs_lock_file() so that, in the
case where multiple clients hold conflicting locks, the loop marks
them all for expiry in one pass, then waits just once at the end.  (A
rough sketch of that is appended below as well.)

--b.
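
Appendix, sketch for note 2: a very rough, untested (not even compiled)
illustration of the drop-and-retry idea.  FL_EXPIRE_NONBLOCK and
EXPIRABLE_CONFLICT are names I'm making up here for the fl_flags bit
and the agreed-on error return we'd have to settle on, and
wait_for_laundromat() is a placeholder for whatever wait mechanism we
end up with:

/* made-up names, just for this sketch: */
#define FL_EXPIRE_NONBLOCK	0x1000	/* fl_flags: don't wait for client expiry */
#define EXPIRABLE_CONFLICT	531	/* conflicting lock held by an expirable client; retry */

static __be32
nfsd4_lock_retry_sketch(struct nfs4_ol_stateid *stp, struct file *filp,
			struct file_lock *fl)
{
	int err;

	/* tell the locks code not to wait for courtesy-client expiry: */
	fl->fl_flags |= FL_EXPIRE_NONBLOCK;

	for (;;) {
		err = vfs_lock_file(filp, F_SETLK, fl, NULL);
		if (err != -EXPIRABLE_CONFLICT)
			break;
		/*
		 * A courtesy client holds a conflicting lock: drop the
		 * stateid mutex so we don't hold it across the wait,
		 * let the laundromat expire the client, then retry.
		 */
		mutex_unlock(&stp->st_mutex);
		wait_for_laundromat();		/* placeholder */
		mutex_lock(&stp->st_mutex);
	}
	return nfserrno(err);
}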
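
And an equally rough sketch of the one-pass mark-then-wait idea from
note 3.  lm_mark_expirable() and wait_for_client_expiry() are made-up
names (not the hooks in the current series), and I've left out the
flc_lock/file_rwsem handling for brevity:

static bool expire_all_conflicting_clients(struct file_lock_context *ctx,
					   struct file_lock *request)
{
	struct file_lock *fl;
	bool marked = false;

	/* Pass 1: flag every expirable client holding a conflicting lock. */
	list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
		if (!posix_locks_conflict(request, fl))
			continue;
		if (fl->fl_lmops && fl->fl_lmops->lm_mark_expirable)
			marked |= fl->fl_lmops->lm_mark_expirable(fl);
	}

	/*
	 * Pass 2: wait just once for nfsd/nfsdcld to commit all of those
	 * expiries; the caller then rescans for conflicts a single time
	 * instead of restarting after each individual client.
	 */
	if (marked)
		wait_for_client_expiry();	/* placeholder */
	return marked;
}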