On Wed, Jun 16, 2021 at 07:29:37PM +0000, Chuck Lever III wrote:
> 
> 
> > On Jun 16, 2021, at 3:25 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> > 
> > On 6/16/21 9:32 AM, Chuck Lever III wrote:
> >>> On Jun 16, 2021, at 12:02 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >>> 
> >>> On Thu, Jun 03, 2021 at 02:14:38PM -0400, Dai Ngo wrote:
> >>>> . instead of destroy the client and all its state on conflict, only destroy
> >>>>   the state that is conflicted with the current request.
> >>> The other todos I think have to be done before we merge, but this one I
> >>> think can wait.
> >> I agree on both points: this one can wait, but the others
> >> should be done before merge.
> > 
> > yes, will do.
> > 
> >> 
> >> 
> >>>> . destroy the COURTESY_CLIENT either after a fixed period of time to release
> >>>>   resources or as reacting to memory pressure.
> >>> I think we need something here, but it can be pretty simple.
> >> We should work out a policy now.
> >> 
> >> A lower bound is good to have. Keep courtesy clients at least
> >> this long. Average network partition length times two as a shot
> >> in the dark. Or it could be N times the lease expiry time.
> >> 
> >> An upper bound is harder to guess at. Obviously these things
> >> will go away when the server reboots. The laundromat could
> >> handle this sooner. However using a shrinker might be nicer and
> >> more Linux-y, keeping the clients as long as practical, without
> >> the need for adding another administrative setting.
> > 
> > Can we start out with a simple 12 or 24 hours to accommodate long
> > network outages for this phase?
> 
> Sure. Let's go with 24 hours.
> 
> Bill suggested adding a "clear_locks" like mechanism that could be
> used to throw out all courteous clients at once. Maybe another
> phase 2 project!

For what it's worth, you can forcibly expire a client by writing
"expire" to /proc/fs/nfsd/clients/xxx/ctl.
So it shouldn't be hard to script this, if we add some kind of
"courtesy" flag to client_info_show() and/or a number of seconds since
the most recent renew.

Maybe adding a command like "expire_if_courtesy" would also simplify
that and avoid a race where the renew comes in simultaneously with the
expire command.

Or we could just add a single call to clear all courtesy clients.  But
the per-client approach would allow more flexibility if you wanted
(e.g. to throw out only clients over a certain age).

--b.
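
To make the scripting idea concrete, here is a rough sketch of what
such a script might look like.  It assumes two things that do not exist
yet and are only proposed above: a "courtesy: yes" line and a
"seconds since renew: N" line in each client's info file.  Both field
names are made up for illustration.  The clients directory is a
parameter, and the function names are hypothetical too.  The only part
taken from existing behaviour is writing "expire" to the per-client ctl
file.

```python
import os


def parse_info(text):
    """Parse the 'key: value' lines of a client info file into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "10.0.0.1:800" keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def expire_courtesy_clients(clients_dir="/proc/fs/nfsd/clients",
                            min_idle=0, dry_run=False):
    """Expire courtesy clients idle for at least min_idle seconds.

    Returns the list of client directory names that were selected.
    """
    expired = []
    for entry in sorted(os.listdir(clients_dir)):
        client = os.path.join(clients_dir, entry)
        try:
            with open(os.path.join(client, "info")) as f:
                info = parse_info(f.read())
        except OSError:
            continue  # client went away, or not a client directory

        # ASSUMPTION: "courtesy" is the flag proposed in this thread;
        # it is not an existing info field.
        if info.get("courtesy") != "yes":
            continue

        # ASSUMPTION: "seconds since renew" is likewise hypothetical.
        try:
            idle = int(info.get("seconds since renew", "0"))
        except ValueError:
            continue
        if idle < min_idle:
            continue

        if not dry_run:
            # The write is the existing interface: "expire" plus newline.
            with open(os.path.join(client, "ctl"), "w") as ctl:
                ctl.write("expire\n")
        expired.append(entry)
    return expired
```

Something like expire_courtesy_clients(min_idle=24 * 3600) would then
throw out only courtesy clients older than a day.  Note that this still
has the race mentioned above: the client can renew between the read of
info and the write to ctl, which is what an "expire_if_courtesy"
command handled in the kernel would close.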