Re: how to properly handle failures during delegation recall process

On Wed, Nov 05, 2014 at 07:41:58AM -0500, Trond Myklebust wrote:
> On Wed, Nov 5, 2014 at 6:57 AM, Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx> wrote:
> > (cc'ing Tom here since we may want to consider providing guidance in
> >  the spec for this situation)
> >
> > Ok, I think both of you are right here :). Here's my interpretation:
> >
> > Olga is correct that the LOCK operation itself is safe since LOCK
> > doesn't actually modify the contents of the file. What it's not safe to
> > do is to trust that LOCK unless and until the DELEGRETURN is also
> > successful.
> >
> > First, let's clarify the potential race that Trond pointed out:
> >
> > Suppose we have a delegation and delegated locks. That delegation is
> > recalled and we do something like this:
> >
> > OPEN with DELEGATE_CUR: NFS4_OK
> > LOCK:                   NFS4_OK
> > LOCK:                   NFS4_OK
> > ...(maybe more successful locks here)...
> > DELEGRETURN:            NFS4ERR_ADMIN_REVOKED
> >
> > ...at that point, we're screwed.
> >
> > The delegation was obviously revoked after we did the OPEN but before
> > the DELEGRETURN. None of those LOCK requests can be trusted since
> > another client may have opened the file at any point in there, acquired
> > any one of those locks and then released it.
> >
> > For v4.1+ the client can do what Trond suggests. Check for
> > SEQ4_STATUS_RECALLABLE_STATE_REVOKED in each LOCK response. If it's set
> > then we can do the TEST_STATEID/FREE_STATEID dance. If the TEST_STATEID
> > fails, then we must consider the most recently acquired lock lost.
> > LOCKU it and give up trying to reclaim the rest of them.
> >
> > For v4.0, I'm not sure what the client can do other than wait until the
> > DELEGRETURN. If that fails with NFS4ERR_ADMIN_REVOKED, then we'll just
> > have to try to unwind the whole mess. Send LOCKUs for all of them and
> > consider them all to be lost.
> >
> > Actually, it may be reasonable to just do the same thing for v4.1. The
> > client tracks NFS_LOCK_LOST on a per-lockstateid basis, so once you have
> > any unreclaimable lock, any I/O done with that stateid is going to fail
> > anyway. You might as well just release any locks you do hold at that
> > point.
> >
> > The other question is whether the server ought to have any role to play
> > here. In principle it could track whether an open/lock stateid is
> > descended from a still outstanding delegation, and revoke those
> > stateids if the delegation is revoked. That would probably not be
> > trivial to do with the current Linux server implementation, however.

That sounds like a problem for whoever wants to implement support for
administrative revocation of state.  We don't really support it
currently.

Oops, right, except for the case where the delegation is revoked just
because the client ran out of time doing the recall.  In that case I
think the final error is going to be either NFS4ERR_EXPIRED (4.0) or
NFS4ERR_DELEG_REVOKED (4.1)?  (Except I think the Linux server is
returning NFS4ERR_BAD_STATEID in the 4.0 case, which looks wrong.)
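
To make the client side concrete, here is a rough sketch of the "unwind
the whole mess" approach Jeff describes above, keyed off the final
DELEGRETURN status.  The error values are the ones from the 4.0 and 4.1
specs; struct reclaimed_lock and both function names are made up for
illustration and are not the Linux client's data structures.  Whether
NFS4ERR_BAD_STATEID should be treated the same way is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Status codes as assigned in RFC 7530 (v4.0) and RFC 5661 (v4.1). */
enum nfsstat4 {
	NFS4_OK               = 0,
	NFS4ERR_EXPIRED       = 10011,
	NFS4ERR_BAD_STATEID   = 10025,
	NFS4ERR_ADMIN_REVOKED = 10047,
	NFS4ERR_DELEG_REVOKED = 10087,
};

/* Hypothetical per-lock bookkeeping, not the real client structures. */
struct reclaimed_lock {
	bool held;	/* LOCK succeeded during the recall */
	bool lost;	/* lock can no longer be trusted */
};

/* Did DELEGRETURN tell us the delegation was gone before we returned it? */
static bool deleg_was_revoked(enum nfsstat4 status)
{
	return status == NFS4ERR_ADMIN_REVOKED ||
	       status == NFS4ERR_EXPIRED ||
	       status == NFS4ERR_DELEG_REVOKED;
}

/*
 * If DELEGRETURN failed with a revocation-type error, release every lock
 * reclaimed under the delegation and mark it lost.  Returns the number of
 * locks released (a real client would send a LOCKU for each one).
 */
static size_t unwind_reclaimed_locks(struct reclaimed_lock *locks, size_t n,
				     enum nfsstat4 delegreturn_status)
{
	size_t released = 0;

	assert(n == 0 || locks != NULL);
	if (!deleg_was_revoked(delegreturn_status))
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (locks[i].held) {
			locks[i].held = false;	/* LOCKU would go here */
			released++;
		}
		locks[i].lost = true;
	}
	return released;
}
```

The point is just that none of the LOCK results are acted on as trusted
until the DELEGRETURN status is known.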

--b.

> What the server could (and probably should) do is revoke all
> open/lock/layout state for the clientid+file combination for which it
> is also revoking the delegation. That means that all applications that
> were using that file on that client would be screwed, but they
> probably will be anyway if the file gets corrupted due to non-atomic
> locking.
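
A minimal sketch of what Trond suggests above, assuming the server keeps
some table of state keyed by clientid and file.  struct state_entry, the
fileid field, and revoke_client_file_state() are hypothetical names for
illustration only; the Linux server does not organize its state this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Kinds of state a client can hold on a file. */
enum state_kind { STATE_OPEN, STATE_LOCK, STATE_LAYOUT, STATE_DELEG };

/* Hypothetical flat state table entry. */
struct state_entry {
	uint64_t clientid;
	uint64_t fileid;
	enum state_kind kind;
	bool revoked;
};

/*
 * When revoking a delegation, also revoke every other piece of state the
 * same client holds on the same file.  Returns how many entries were
 * newly revoked; state for other clients or other files is left alone.
 */
static size_t revoke_client_file_state(struct state_entry *tbl, size_t n,
				       uint64_t clientid, uint64_t fileid)
{
	size_t count = 0;

	assert(n == 0 || tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].clientid != clientid || tbl[i].fileid != fileid)
			continue;
		if (!tbl[i].revoked) {
			tbl[i].revoked = true;
			count++;
		}
	}
	return count;
}
```

With that in place the client would see errors on its open and lock
stateids for that file, rather than carrying on with locks that were
never atomically acquired.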