On Wed, Apr 13, 2022 at 08:55:50AM -0400, Bruce Fields wrote: > On Fri, Apr 01, 2022 at 12:11:34PM -0700, dai.ngo@xxxxxxxxxx wrote: > > On 4/1/22 8:57 AM, Chuck Lever III wrote: > > >>(And to be honest I'd still prefer the original approach where we expire > > >>clients from the posix locking code and then retry. It handles an > > >>additional case (the one where reboot happens after a long network > > >>partition), and I don't think it requires adding these new client > > >>states....) > > >The locking of the earlier approach was unworkable. > > > > > >But, I'm happy to consider that again if you can come up with a way > > >of handling it properly and simply. > > > > I will wait for feedback from Bruce before sending v20 with the > > above change. > > OK, I'd like to tweak the design in that direction. > > I'd like to handle the case where the network goes down for a while, and > the server gets power-cycled before the network comes back up. I think > that could easily happen. There's no reason clients couldn't reclaim > all their state in that case. We should let them. > > To handle that case, we have to delay removing the client's stable > storage record until there's a lock conflict. That means code that > checks for conflicts must be able to sleep. > > In each case (opens, locks, delegations), conflicts are first detected > while holding a spinlock. So we need to unlock before waiting, and then > retry if necessary. > > We decided instead to remove the stable-storage record when first > converting a client to a courtesy client--then we can handle a conflict > by just setting a flag on the client that indicates it should no longer > be used, no need to drop any locks. > > That leaves the client in a state where it's still on a bunch of global > data structures, but has to be treated as if it no longer exists. That > turns out to require more special handling than expected. You've shown > admirable persistance in handling those cases, but I'm still not > completely convinced this is correct. > > We could avoid that complication, and also solve the > server-reboot-during-network-partition problem, if we went back to the > first plan and allowed ourselves to sleep at the time we detect a > conflict. I don't think it's that complicated. > > We end up using a lot of the same logic regardless, so don't throw away > the existing patches. > > My basic plan is: > > Keep the client state, but with only three values: ACTIVE, COURTESY, and > EXPIRABLE. > > ACTIVE is the initial state, which we return to whenever we renew. The > laundromat sets COURTESY whenever a client isn't renewed for a lease > period. When we run into a conflict with a lock held by a client, we > call > > static bool try_to_expire_client(struct nfs4_client *clp) > { > return COURTESY == cmpxchg(clp->cl_state, COURTESY, EXPIRABLE); > } > > If it returns true, that tells us the client was a courtesy client. We > then call queue_work(laundry_wq, &nn->laundromat_work) to tell the > laundromat to actually expire the client. Then if needed we can drop > locks, wait for the laundromat to do the work with > flush_workqueue(laundry_wq), and retry. > > All the EXPIRABLE state does is tell the laundromat to expire this > client. It does *not* prevent the client from being renewed and > acquiring new locks--if that happens before the laundromat gets to the > client, that's fine, we let it return to ACTIVE state and if someone > retries the conflicing lock they'll just get a denial. > > Here's a suggested a rough patch ordering. If you want to go above and > beyond, I also suggest some tests that should pass after each step: > > > PATCH 1 > ------- > > Implement courtesy behavior *only* for clients that have > delegations, but no actual opens or locks: > > Define new cl_state field with values ACTIVE, COURTESY, and EXPIRABLE. > Set to ACTIVE on renewal. Modify the laundromat so that instead of > expiring any client that's too old, it first checks if a client has > state consisting only of unconflicted delegations, and, if so, it sets > COURTESY. > > Define try_to_expire_client as above. In nfsd_break_deleg_cb, call > try_to_expire_client and queue_work. (But also continue scheduling the > recall as we do in the current code, there's no harm to that.) > > Modify the laundromat to try to expire old clients with EXPIRED set. > > TESTS: > - Establish a client, open a file, get a delegation, close the > file, wait 2 lease periods, verify that you can still use the > delegation. > - Establish a client, open a file, get a delegation, close the > file, wait 2 lease periods, establish a second client, request > a conflicting open, verify that the open succeeds and that the > first client is no longer able to use its delegation. > > > PATCH 2 > ------- > > Extend courtesy client behavior to clients that have opens or > delegations, but no locks: > > Modify the laundromat to set COURTESY on old clients with state > consisting only of opens or unconflicted delegations. > > Add in nfs4_resolve_deny_conflicts_locked and friends as in your patch > "Update nfs4_get_vfs_file()...", but in the case of a conflict, call > try_to_expire_client and queue_work(), then modify e.g. > nfs4_get_vfs_file to flush_workqueue() and then retry after unlocking > fi_lock. > > TESTS: > - establish a client, open a file, wait 2 lease periods, verify > that you can still use the open stateid. > - establish a client, open a file, wait 2 lease periods, > establish a second client, request an open with a share mode > conflicting with the first open, verify that the open succeeds > and that first client is no longer able to use its open. > > PATCH 3 > ------- > > Minor tweak to prevent the laundromat from being freed out from > under a thread processing a conflicting lock: > > Create and destroy the laundromat workqueue in init_nfsd/exit_nfsd > instead of where it's done currently. > > (That makes the laundromat's lifetime longer than strictly necessary. > We could do better with a little more work; I think this is OK for now.) > > TESTS: > - just rerun any regression tests; this patch shouldn't change > behavior. > > PATCH 4 > ------- > > Extend courtesy client behavior to any client with state, including > locks: > > Modify the laundromat to set COURTESY on any old client with state. > > Add two new lock manager callbacks: > > void * (*lm_lock_expirable)(struct file_lock *); > bool (*lm_expire_lock)(void *); > > If lm_lock_expirable() is called and returns non-NULL, posix_lock_inode > should drop flc_lock, call lm_expire_lock() with the value returned from > lm_lock_expirable, and then restart the loop over flc_posix from the > beginning. > > For now, nfsd's lm_lock_expirable will basically just be > > if (try_to_expire_client()) { > queue_work() > return get_net(); Correction: I forgot that the laundromat is global, not per-net. So, we can skip the put_net/get_net. Also, lm_lock_expirable can just return bool instead of void *, and lm_expire_lock needs no arguments. --b. > } > return NULL; > > and lm_expire_lock will: > > flush_workqueue() > put_net() > > One more subtlety: the moment we drop the flc_lock, it's possible > another task could race in and free it. Worse, the nfsd module could be > removed entirely--so nfsd's lm_expire_lock code could disappear out from > under us. To prevent this, I think we need to add a struct module > *owner field to struct lock_manager_operations, and use it like: > > owner = fl->fl_lmops->owner; > __get_module(owner); > expire_lock = fl->fl_lmops->lm_expire_lock; > spin_unlock(&ctx->flc_lock); > expire_lock(...); > module_put(owner); > > Maybe there's some simpler way, but I don't see it. > > TESTS: > - retest courtesy client behavior using file locks this time. > > -- > > That's the basic idea. I think it should work--though I may have > overlooked something. > > This has us flush the laundromat workqueue while holding mutexes in a > couple cases. We could avoid that with a little more work, I think. > But those mutexes should only be associated with the client requesting a > new open/lock, and such a client shouldn't be touched by the laundromat, > so I think we're OK. > > It'd also be helpful to update the info file with courtesy client > information, as you do in your current patches. > > Does this make sense? > > --b.