On Fri, 12 Sep 2014 12:29:01 -0400
Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> On Fri, Sep 12, 2014 at 12:07 PM, Jeff Layton
> <jeff.layton@xxxxxxxxxxxxxxx> wrote:
> > On Fri, 12 Sep 2014 11:54:17 -0400
> > Trond Myklebust <trondmy@xxxxxxxxx> wrote:
> >
> >> On Fri, Sep 12, 2014 at 11:21 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >> > On Fri, Sep 12, 2014 at 10:36:21AM -0400, J. Bruce Fields wrote:
> >> >> On Fri, Sep 12, 2014 at 10:21:53AM -0400, Jeff Layton wrote:
> >> >> > Grace period
> >> >> > eventually ends, and its record is purged from the DB.
> >> >> >
> >> >> > Now we have a client that has reclaimed some files but that has no
> >> >> > record on stable storage.
> >> >> >
> >> >> > One possibility is to prematurely expire v4.1+ clients that have not
> >> >> > sent a RECLAIM_COMPLETE when the grace period ends.
> >> >> >
> >> >> > That seems problematic though -- what about clients that just happen to
> >> >> > do an EXCHANGE_ID just before the grace period is going to end, and
> >> >> > that get expired before they can issue their RECLAIM_COMPLETE. Will
> >> >> > that be a problem for them?
> >> >>
> >> >> In that case a client will send a reclaim, get back a NO_GRACE error,
> >> >> mark the rest of its state as unrecoverable, send the RECLAIM_COMPLETE,
> >> >> and continue normally. (To the extent it can--signalling affected
> >> >> processes or EIOing further attempts to use the unreclaimed state, or
> >> >> whatever.)
> >> >
> >> > The one thing the server *could* do in this sort of case is extend the
> >> > grace period by a little--I seem to recall the spec giving some leeway
> >> > for this kind of thing.
> >>
> >>
> >> Section 8.4.2.1.
> >>
> >> > So for example the server could have a heuristic like: extend the grace
> >> > period by another second each time we notice there's been an EXCHANGE_ID
> >> > or reclaim in the previous second, up to some maximum.
> >> > And I suppose it
> >> > could also delay the grace period until someone actually attempts a
> >> > non-reclaim open.
> >> >
> >> > In isolation a single client slipping in the end like that sounds like a
> >> > freak event, but if there's a ton of state to reclaim perhaps it could
> >> > become more likely.
> >> >
> >> > I don't think that's a priority, we might just want to make sure we know
> >> > how to do that in the future.
> >> >
> >> > But now that I think about it I don't see the existing or proposed
> >> > nfsdcltrack stuff tying our hands in any way here. It just gives the
> >> > kernel some extra information, and the kernel still has discretion about
> >> > when exactly it wants to end the grace period.
> >> >
> >>
> >> It is even allowed to grant reclaim lock attempts after the grace
> >> period has ended _if_ and only if it can guarantee that no conflicting
> >> locks were issued.
> >>
> >> However note that the NFSv4.1 client is not actually allowed to issue
> >> non-reclaim lock requests before it has issued a RECLAIM_COMPLETE. I
> >> dunno how religiously we stick to that in Linux (I think we do), but
> >> the point is that the server can and should rely on the client
> >> _always_ sending a RECLAIM_COMPLETE if it is going to establish new
> >> locks.
> >
> > Yeah, I'm pretty sure that bit is enforced. The problem situation that
> > I think Bruce was referring to is this:
> >
> > Server reboots. Client1 reclaims some of its locks (but not all) and
> > never sends a RECLAIM_COMPLETE. Grace period ends and then server
> > hands out a lock to client2 that was previously held by client1 but
> > that didn't get reclaimed.
> >
> > Server reboots again, prior to the client1 expiring (so its record is
> > still in the DB). Now client1 comes back and starts reclaiming again.
> > This time it reclaims all of its locks and we have a conflict between
> > it and client2.
> >
> > It's a solvable problem, but I'll need to work through how best to do
> > so.
> >
> > --
>
> That's the first edge condition described in section 8.4.3.
>

Actually, it's case #2 I think...

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
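
For reference, the grace-period extension heuristic Bruce describes in the quoted text (extend by another second whenever an EXCHANGE_ID or reclaim was seen in the previous second, up to some maximum) could be modelled roughly as below. This is only a user-space sketch to make the idea concrete, not how nfsd implements or would implement it. The class name GraceTimer, its method names, and the default lengths are all hypothetical.

```python
class GraceTimer:
    """Hypothetical model of a grace period that can be stretched a little.

    All names and default values here are illustrative assumptions,
    not taken from nfsd or from the spec.
    """

    def __init__(self, start, base_len=90.0, max_len=120.0,
                 step=1.0, window=1.0):
        self.end = start + base_len          # current scheduled end of grace
        self.hard_limit = start + max_len    # never extend past this point
        self.step = step                     # how much to extend each time
        self.window = window                 # "previous second" look-back
        self.last_activity = None            # time of last EXCHANGE_ID/reclaim

    def note_reclaim_activity(self, now):
        """Record that an EXCHANGE_ID or reclaim arrived at time `now`."""
        self.last_activity = now

    def in_grace(self, now):
        """Return True while the server should still be in grace."""
        if now < self.end:
            return True
        recent = (self.last_activity is not None
                  and now - self.last_activity <= self.window)
        if recent and self.end < self.hard_limit:
            # Reclaim traffic seen very recently: push the end out by one
            # step, but never beyond the hard limit.
            self.end = min(max(self.end, now) + self.step, self.hard_limit)
            return now < self.end
        return False


timer = GraceTimer(start=0.0, base_len=10.0, max_len=12.0)
timer.note_reclaim_activity(9.5)
print(timer.in_grace(10.0))   # a reclaim arrived in the last second
print(timer.in_grace(11.0))   # no further activity since then
```

The hard limit matters: without it, a steady trickle of EXCHANGE_IDs could keep the server in grace indefinitely and block non-reclaim opens.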