On Thu, 2020-12-03 at 16:13 -0500, bfields@xxxxxxxxxxxx wrote:
> On Thu, Dec 03, 2020 at 08:27:39PM +0000, Trond Myklebust wrote:
> > On Thu, 2020-12-03 at 13:51 -0500, bfields wrote:
> > > I've been scratching my head over how to handle reboot of a
> > > re-exporting server.  I think one way to fix it might be just to
> > > allow the re-export server to pass along reclaims to the original
> > > server as it receives them from its own clients.  It might require
> > > some protocol tweaks, I'm not sure.  I'll try to get my thoughts
> > > in order and propose something.
> >
> > It's more complicated than that. If the re-exporting server reboots,
> > but the original server does not, then unless that re-exporting
> > server persisted its lease and a full set of stateids somewhere, it
> > will not be able to atomically reclaim delegation and lock state on
> > the server on behalf of its clients.
>
> By sending reclaims to the original server, I mean literally sending
> new open and lock requests with the RECLAIM bit set, which would get
> brand new stateids.
>
> So, the original server would invalidate the existing client's
> previous clientid and stateids--just as it normally would on
> reboot--but it would optionally remember the underlying locks held by
> the client and allow compatible lock reclaims.
>
> Rough attempt:
>
> https://wiki.linux-nfs.org/wiki/index.php/Reboot_recovery_for_re-export_servers
>
> Think it would fly?

So this would be a variant of courtesy locks that can be reclaimed by
the client using the reboot reclaim variant of OPEN/LOCK outside the
grace period? The purpose being to allow reclaim without forcing the
client to persist the original stateid?

Hmm... That's doable, but how about the following alternative: Add a
function that allows the client to request the full list of stateids
that the server holds on its behalf?

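As a rough illustration of what a client could do with such a list, here
is a minimal sketch. It assumes a hypothetical operation that returns
every stateid the server holds for the client; nothing in this thread
defines such an operation, and the names StateId and diff_state are
invented for the example. The only protocol detail relied on is that a
stateid consists of a seqid plus an opaque "other" field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateId:
    """Hypothetical stand-in for an NFSv4 stateid (seqid + opaque 'other')."""
    seqid: int
    other: bytes  # 12 opaque bytes on the wire; readable labels used here


def diff_state(client_known, server_reported):
    """Compare the client's bookkeeping with the server's reported list.

    Stateids are matched on the opaque 'other' field only, since the
    seqid changes as the state is updated.

    Returns (server_only, client_only):
      server_only: state the server holds that the client has no record of
      client_only: state the client tracks that the server did not report
    """
    client = {s.other for s in client_known}
    server = {s.other for s in server_reported}
    return sorted(server - client), sorted(client - server)


# Example: the server still holds an open the client no longer tracks.
client_known = [StateId(1, b"open-file-a"), StateId(3, b"lock-file-a")]
server_reported = client_known + [StateId(1, b"open-file-b")]

server_only, client_only = diff_state(client_known, server_reported)
print(server_only)  # [b'open-file-b']
print(client_only)  # []
```

In this sketch, a rebooted re-export server would start with an empty
client_known, so everything the original server reports would show up in
server_only and become a candidate for recovery rather than something
the re-export server had to persist itself.
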
I've been wanting such a function for quite a while anyway in order to
allow the client to detect state leaks (either due to soft timeouts, or
due to reordered close/open operations).

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx