On Thu, 2020-12-03 at 18:16 -0500, bfields@xxxxxxxxxxxx wrote:
> On Thu, Dec 03, 2020 at 10:53:26PM +0000, Trond Myklebust wrote:
> > On Thu, 2020-12-03 at 17:45 -0500, bfields@xxxxxxxxxxxx wrote:
> > > On Thu, Dec 03, 2020 at 09:34:26PM +0000, Trond Myklebust wrote:
> > > > I've been wanting such a function for quite a while anyway in
> > > > order to allow the client to detect state leaks (either due to
> > > > soft timeouts, or due to reordered close/open operations).
> > >
> > > One sure way to fix any state leaks is to reboot the server. The
> > > server throws everything away, the clients reclaim, all that's
> > > left is stuff they still actually care about.
> > >
> > > It's very disruptive.
> > >
> > > But you could do a limited version of that: the server throws
> > > away the state from one client (keeping the underlying locks on
> > > the exported filesystem), lets the client go through its normal
> > > reclaim process, at the end of that throws away anything that
> > > wasn't reclaimed. The only delay is to anyone trying to acquire
> > > new locks that conflict with that set of locks, and only for as
> > > long as it takes for the one client to reclaim.
> >
> > One could do that, but that requires the existence of a quiescent
> > period where the client holds no state at all on the server.
>
> No, as I said, the client performs reboot recovery for any state
> that it holds when we do this.
>

Hmm... So how do the client and server coordinate what can and cannot
be reclaimed? The issue is that races can work both ways, with the
client sometimes believing that it holds a layout or a delegation that
the server thinks it has returned. If the server allows a reclaim of
such a delegation, then that could be problematic (because it breaks
lock atomicity on the client and because it may cause conflicts).
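To make the coordination question concrete, here is a toy model of the
per-client reset described above. It is only a sketch under stated
assumptions: `ToyServer` and all of its methods are hypothetical names
invented for illustration and correspond to nothing in nfsd or the
Linux client. It assumes the server only accepts a reclaim for state
it still believed the client held when the reset began.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Toy model of a per-client state reset; not real nfsd behaviour."""

    # client id -> state ids the server believes the client holds
    held: dict = field(default_factory=dict)
    # client id -> state ids still awaiting reclaim during a reset
    pending: dict = field(default_factory=dict)

    def grant(self, client, stateid):
        self.held.setdefault(client, set()).add(stateid)

    def release(self, client, stateid):
        # Server-side view: the client has returned this state.
        self.held.get(client, set()).discard(stateid)

    def begin_reset(self, client):
        # Forget the client's state, but remember what it may reclaim.
        self.pending[client] = self.held.pop(client, set())
        self.held[client] = set()

    def reclaim(self, client, stateid):
        # Assumption: only state the server still thought was held
        # at the start of the reset may be reclaimed.
        if stateid in self.pending.get(client, set()):
            self.pending[client].discard(stateid)
            self.held[client].add(stateid)
            return True
        return False

    def end_reset(self, client):
        # Whatever was not reclaimed is treated as leaked and dropped.
        return self.pending.pop(client, set())


server = ToyServer()
for sid in ("open-1", "lock-1", "deleg-1"):
    server.grant("clientA", sid)

# The race: the server thinks deleg-1 was returned, the client does not.
server.release("clientA", "deleg-1")

server.begin_reset("clientA")
results = {sid: server.reclaim("clientA", sid) for sid in ("open-1", "deleg-1")}
leaked = server.end_reset("clientA")
```

In this model the client's reclaim of `deleg-1` is refused, and
`lock-1`, which the client never reclaimed, is discarded as a leak.
What the sketch does not answer is how the client should react to the
refused reclaim, which is the coordination problem raised above.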

By the way, the other thing that I'd like to add to my wishlist is a
callback that allows the server to ask the client if it still holds a
given open or lock stateid. A server can recall a delegation or a
layout, so it can fix up leaks of those, however it has no remedy if
the client loses an open or lock stateid other than to possibly
forcibly revoke state. That could cause application crashes if the
server makes a mistake and revokes a lock that is actually in use.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx