On Thu, Dec 03, 2020 at 10:14:25PM +0000, Trond Myklebust wrote:
> On Thu, 2020-12-03 at 17:04 -0500, bfields@xxxxxxxxxxxx wrote:
> > On Thu, Dec 03, 2020 at 09:57:41PM +0000, Trond Myklebust wrote:
> > > On Thu, 2020-12-03 at 13:45 -0800, Frank Filz wrote:
> > > > > On Thu, 2020-12-03 at 16:13 -0500, bfields@xxxxxxxxxxxx wrote:
> > > > > > On Thu, Dec 03, 2020 at 08:27:39PM +0000, Trond Myklebust
> > > > > > wrote:
> > > > > > > On Thu, 2020-12-03 at 13:51 -0500, bfields wrote:
> > > > > > > > I've been scratching my head over how to handle
> > > > > > > > reboot of a re-exporting server.  I think one way to
> > > > > > > > fix it might be just to allow the re-export server to
> > > > > > > > pass along reclaims to the original server as it
> > > > > > > > receives them from its own clients.  It might require
> > > > > > > > some protocol tweaks, I'm not sure.  I'll try to get
> > > > > > > > my thoughts in order and propose something.
> > > > > > > >
> > > > > > >
> > > > > > > It's more complicated than that. If the re-exporting
> > > > > > > server reboots, but the original server does not, then
> > > > > > > unless that re-exporting server persisted its lease and
> > > > > > > a full set of stateids somewhere, it will not be able to
> > > > > > > atomically reclaim delegation and lock state on the
> > > > > > > server on behalf of its clients.
> > > > > >
> > > > > > By sending reclaims to the original server, I mean
> > > > > > literally sending new open and lock requests with the
> > > > > > RECLAIM bit set, which would get brand new stateids.
> > > > > >
> > > > > > So, the original server would invalidate the existing
> > > > > > client's previous clientid and stateids--just as it
> > > > > > normally would on reboot--but it would optionally remember
> > > > > > the underlying locks held by the client and allow
> > > > > > compatible lock reclaims.
> > > > > >
> > > > > > Rough attempt:
> > > > > >
> > > > > > https://wiki.linux-nfs.org/wiki/index.php/Reboot_recovery_for_re-export_servers
> > > > > >
> > > > > > Think it would fly?
> > > > >
> > > > > So this would be a variant of courtesy locks that can be
> > > > > reclaimed by the client using the reboot reclaim variant of
> > > > > OPEN/LOCK outside the grace period? The purpose being to
> > > > > allow reclaim without forcing the client to persist the
> > > > > original stateid?
> > > > >
> > > > > Hmm... That's doable, but how about the following
> > > > > alternative: Add a function that allows the client to request
> > > > > the full list of stateids that the server holds on its
> > > > > behalf?
> > > > >
> > > > > I've been wanting such a function for quite a while anyway in
> > > > > order to allow the client to detect state leaks (either due
> > > > > to soft timeouts, or due to reordered close/open operations).
> > > >
> > > > Oh, that sounds interesting. So basically the re-export server
> > > > would re-populate its state from the original server rather
> > > > than relying on its clients doing reclaims? Hmm, but how does
> > > > the re-export server rebuild its stateids? I guess it could
> > > > make the clients repopulate them with the same "give me a dump
> > > > of all my state", using the state details to match up with the
> > > > old state and replacing stateids. Or did you have something
> > > > different in mind?
> > > >
> > >
> > > I was thinking that the re-export server could just use that list
> > > of stateids to figure out which locks can be reclaimed atomically,
> > > and which ones have been irredeemably lost. The assumption is that
> > > if you have a lock stateid or a delegation, then that means the
> > > clients can reclaim all the locks that were represented by that
> > > stateid.
> >
> > I'm confused about how the re-export server uses that list.  Are you
> > assuming it persisted its own list across its own crash/reboot?  I
> > guess that's what I was trying to avoid having to do.
> >
> No. The server just uses the stateids as part of a check for 'do I hold
> state for this file on this server?'. If the answer is 'yes' and the
> lock owners are sane, then we should be able to assume the full set of
> locks that lock owner held on that file are still valid.
>
> BTW: if the lock owner is also returned by the server, then since the
> lock owner is an opaque value, it could, for instance, be used by the
> client to cache info on the server about which uid/gid owns these
> locks.

OK, so the list of stateids returned by the server has entries that look
like

	(type, filehandle, owner, stateid)

(where type=open or lock?).

I guess I'd need to see this in more detail.

--b.
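
To make the 'do I hold state for this file on this server?' check a bit more concrete, here is a rough Python sketch of how a re-export server might consult such a list when one of its clients sends a lock reclaim. Everything in it is an assumption for illustration only: the StateEntry layout simply mirrors the (type, filehandle, owner, stateid) tuple above, the "delegation" type is added because a delegation was mentioned as also covering locks, and the names index_state and can_reclaim_locks are made up. No such protocol operation exists today.

```python
from dataclasses import dataclass

# Hypothetical state types. The thread only floats "open or lock?";
# "delegation" is an assumption based on the earlier reply.
OPEN, LOCK, DELEGATION = "open", "lock", "delegation"


@dataclass(frozen=True)
class StateEntry:
    """One entry of the assumed state dump: (type, filehandle, owner, stateid)."""
    type: str
    filehandle: bytes
    owner: bytes      # opaque owner, as returned by the original server
    stateid: bytes


def index_state(dump):
    """Group the dumped entries by filehandle for quick lookup."""
    by_file = {}
    for entry in dump:
        by_file.setdefault(entry.filehandle, []).append(entry)
    return by_file


def can_reclaim_locks(by_file, filehandle, owner):
    """Return True if the original server still holds state covering this
    owner's locks on this file: either a delegation on the file, or a
    lock stateid for the same owner. An open stateid alone is not enough."""
    for entry in by_file.get(filehandle, []):
        if entry.type == DELEGATION:
            return True
        if entry.type == LOCK and entry.owner == owner:
            return True
    return False


# Example dump as the re-export server might receive it after its reboot.
dump = [
    StateEntry(LOCK, b"fh-A", b"owner-1", b"sid-1"),
    StateEntry(DELEGATION, b"fh-B", b"", b"sid-2"),
    StateEntry(OPEN, b"fh-C", b"owner-2", b"sid-3"),
]
by_file = index_state(dump)
```

With this dump, a reclaim from owner-1 on fh-A would be allowed, any owner's reclaim on fh-B would be allowed because of the delegation, and a reclaim on fh-C would be treated as lost since only an open stateid survives. Whether a delegation should really cover every owner, and what "the lock owners are sane" means in practice, are exactly the details that would still need to be spelled out.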