On Wed, 19 Dec 2018, J. Bruce Fields wrote:
> On Wed, Dec 19, 2018 at 05:21:47PM -0500, J. Bruce Fields wrote:
> > On Wed, Dec 19, 2018 at 05:05:45PM -0500, Scott Mayhew wrote:
> > > What if a client sends a RECLAIM_COMPLETE, then reboots and sends an
> > > EXCHANGE_ID, CREATE_SESSION, and RECLAIM_COMPLETE while the server is
> > > still in grace? The count would be too high then and the server could
> > > exit grace before all the clients have reclaimed. I actually added
> > > that at Jeff's suggestion because he was seeing it with nfs-ganesha.
> >
> > Oh boy.
> >
> > (Thinks.)
> >
> > Once it issues a DESTROY_CLIENTID or an EXCHANGE_ID that removes the
> > previous client instance's state, it's got no locks to reclaim any more.
> > (It can't have gotten any *new* ones, since we're still in the grace
> > period.)
> >
> > It's effectively a brand new client. Only reclaiming clients should
> > bump that counter.
> >
> > We certainly shouldn't be waiting for it to RECLAIM_COMPLETE to end the
> > grace period, that client just doesn't matter any more.
>
> Actually, once the client's destroyed, it shouldn't matter whether the
> previous incarnation of the client reclaimed or not. It's never going
> to reclaim now.... So expire_client should probably just be removing
> the client from the table of reclaimable clients at the same time that
> it removes its stable storage record.

Okay, come to think of it, if we're in grace then nfsdcld should be
removing the client record from both the current and recovery epoch's db
tables too...

>
> --b.
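
To make the accounting discussed above concrete, here is a rough userspace
sketch in C. It is only a toy model under stated assumptions: the names
reclaim_table, table_add, table_reclaim_complete, table_expire_client and
table_can_end_grace are made up for illustration and are not the nfsd or
nfsdcld code. The idea is that only clients still present in the table of
reclaimable clients bump the counter, and expiring a client drops it from
both the table and the count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLIENTS 16
#define NAME_LEN    64

/* Hypothetical model of one client recorded in the previous epoch. */
struct reclaim_entry {
	char name[NAME_LEN];
	bool in_use;
	bool reclaim_complete;
};

/* Hypothetical table of reclaimable clients plus the two counters. */
struct reclaim_table {
	struct reclaim_entry entries[MAX_CLIENTS];
	size_t reclaimable;	/* clients allowed to reclaim */
	size_t completed;	/* of those, how many sent RECLAIM_COMPLETE */
};

void table_init(struct reclaim_table *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

static struct reclaim_entry *table_find(struct reclaim_table *t,
					const char *name)
{
	for (size_t i = 0; i < MAX_CLIENTS; i++) {
		struct reclaim_entry *e = &t->entries[i];

		if (e->in_use && strcmp(e->name, name) == 0)
			return e;
	}
	return NULL;
}

/* Record a client loaded from stable storage at the start of grace. */
bool table_add(struct reclaim_table *t, const char *name)
{
	if (table_find(t, name))
		return false;
	for (size_t i = 0; i < MAX_CLIENTS; i++) {
		struct reclaim_entry *e = &t->entries[i];

		if (!e->in_use) {
			strncpy(e->name, name, NAME_LEN - 1);
			e->name[NAME_LEN - 1] = '\0';
			e->in_use = true;
			e->reclaim_complete = false;
			t->reclaimable++;
			return true;
		}
	}
	return false;	/* table full */
}

/* Only a client still in the table bumps the counter, and only once. */
bool table_reclaim_complete(struct reclaim_table *t, const char *name)
{
	struct reclaim_entry *e = table_find(t, name);

	if (!e || e->reclaim_complete)
		return false;
	e->reclaim_complete = true;
	t->completed++;
	return true;
}

/* Expiring a client removes it from the table and from both counts. */
void table_expire_client(struct reclaim_table *t, const char *name)
{
	struct reclaim_entry *e = table_find(t, name);

	if (!e)
		return;
	if (e->reclaim_complete)
		t->completed--;
	t->reclaimable--;
	e->in_use = false;
}

/* Grace can end early once every remaining reclaimable client is done. */
bool table_can_end_grace(const struct reclaim_table *t)
{
	return t->completed == t->reclaimable;
}
```

In this model, a client that sends RECLAIM_COMPLETE, reboots, and comes
back during grace is expired first, so its second RECLAIM_COMPLETE is not
counted and the server still waits for the remaining reclaimable clients.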