Re: global openowner_id and lockowner_id

On Thu, 2012-04-12 at 12:05 -0400, Chuck Lever wrote:
> On Apr 12, 2012, at 11:58 AM, Myklebust, Trond wrote:
> 
> > On Thu, 2012-04-12 at 11:54 -0400, Chuck Lever wrote:
> >> On Apr 12, 2012, at 11:50 AM, Myklebust, Trond wrote:
> >> 
> >>> On Thu, 2012-04-12 at 11:42 -0400, Chuck Lever wrote:
> >>>> Hi-
> >>>> 
> >>>> Changing the SETCLIENTID boot verifier so it is global for the whole client exposes a problem with how we allocate state owners.
> >>>> 
> >>>> A quick umount / mount sequence destroys all state on the client.  But since the client now always uses the same boot verifier and nfs_client_id4 string, the server no longer recognizes a client reboot.  For a fresh mount, the client may perform a SETCLIENTID, but it is treated as a callback update (state is not purged) if the client's lease has not yet expired.
> >>>> 
> >>>> Our state owners are generated from a pair of ida structures in the nfs_server for that mount.  They always start from zero after a mount operation.  Likewise, the sequence IDs for these state owners are also reset by umount / mount.  Note that each mount point gets a fresh nfs_server, so these structures are not retained across umount / mount.
> >>>> 
> >>>> This means umount / mount with no lease expiry starts to re-play state owners with reset sequence IDs.  Servers don't really care for that behavior.  I have a test case that reliably gets a BAD_SEQID error from a server after a quick umount / mount followed by a single file creation.
> >>>> 
> >>>> Now that we are about to switch to using more-or-less global SETCLIENTID boot verifiers, it seems to me that we really want a global openowner_id and lockowner_id as well.
> >>>> 
> >>>> The performance impact of such a change might be acceptable because we cache and reuse state owners now.
> >>>> 
> >>>> Thoughts?
> >>> 
> >>> That's a definite server bug. If the client holds no open state, then it
> >>> is allowed to forget the open owner and start the sequence id from 0
> >>> again. It is not required to remember sequence ids for open owners that
> >>> aren't in use.
> >>> 
> >>> Our current client could easily trigger this bug even without a
> >>> umount/mount.
> >> 
> >> The client is holding open state.  Here's the exact reproducer on my modified client:
> >> 
> >> 1.  mount server:/export /mnt
> >> 2.  touch /mnt/newfile
> >> 3.  umount /mnt
> >> 4.  mount server:/export /mnt
> >> 5.  touch /mnt/newfile2
> >> 
> >> Step 5 causes the client to replay an open owner with a reset sequence ID, and the server replies BAD_SEQID.
> > 
> > touch won't keep the file open. There is no open state once touch has
> > finished executing.
> 
> OK, agreed.
> 
> > What you have exposed above is a _server_ bug. The server is _not_
> > allowed to assume that the client will cache an open owner forever once
> it no longer holds any open state using that open owner. We had a long
> > discussion about this on the mailing list a few years ago with David
> > Robinson being the person who formulated the above rule.
> 
> I'm not sure I would characterize this as a server bug just yet.  On OPEN, the server is allowed to tell the client it is using a bad sequence ID, and the client is supposed to recover by trying again with a different OO.  Our BAD_SEQID recovery logic appears to be broken, because our client goes into a loop retrying the OPEN with the same OO.  If recovery worked, this would all be perfectly transparent, I think.

After re-reading the thread on
http://www.ietf.org/mail-archive/web/nfsv4/current/msg01719.html I'm
having second thoughts. I did remember being convinced by David's
arguments, but looking back it does not appear that we achieved
consensus.

The whole problem in the spec is that although the server is allowed to
forget the open owner when the client no longer holds any state, it is
not _required_ to do so.
In other words, the client needs to be prepared for either BAD_SEQID or
an OPEN_CONFIRM request if it tries to OPEN using a recycled open owner.

> I was taking a step back and wondering how the client chose the OO in the first place.
> 
> But you claimed above that our client could trigger this bug without a umount / mount sequence.  Do you have an example of how I might try that?

The latest client uses the ida allocator in order to generate unique
identifiers. That particular allocator offers no guarantees that it
won't re-use an identifier that is no longer in use. In fact, because it
uses find_next_zero_bit(), reuse is actually the norm.

So I believe that all you need to do here is create 2 open owners using
simultaneous 'open()' calls from 2 processes with different credentials,
then close both file descriptors, wait 1 lease period, and then try
'open()' from both processes again.
As far as I can see, the garbage collector should throw out the second
open owner when you do the third 'open()', and so the client will
generate a new one with the same ida value when you do the fourth
open().

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
