On Thu, 2012-04-12 at 11:54 -0400, Chuck Lever wrote:
> On Apr 12, 2012, at 11:50 AM, Myklebust, Trond wrote:
>
> > On Thu, 2012-04-12 at 11:42 -0400, Chuck Lever wrote:
> >> Hi-
> >>
> >> Changing the SETCLIENTID boot verifier so it is global for the whole client exposes a problem with how we allocate state owners.
> >>
> >> A quick umount / mount sequence destroys all state on the client. But since the client now always uses the same boot verifier and nfs_client_id4 string, the server no longer recognizes a client reboot. For a fresh mount, the client may perform a SETCLIENTID, but it is treated as a callback update (state is not purged) if the client's lease has not yet expired.
> >>
> >> Our state owners are generated from a pair of ida structures in the nfs_server for that mount. They always start from zero after a mount operation. Likewise, the sequence IDs for these state owners are also reset by umount / mount. Note that each mount point gets a fresh nfs_server, so these structures are not retained across umount / mount.
> >>
> >> This means umount / mount with no lease expiry starts to re-play state owners with reset sequence IDs. Servers don't really care for that behavior. I have a test case that reliably gets a BAD_SEQID error from a server after a quick umount / mount followed by a single file creation.
> >>
> >> Now that we are about to switch to using more-or-less global SETCLIENTID boot verifiers, it seems to me that we really want a global openowner_id and lockowner_id as well.
> >>
> >> The performance impact of such a change might be acceptable because we cache and reuse state owners now.
> >>
> >> Thoughts?
> >
> > That's a definite server bug. If the client holds no open state, then it
> > is allowed to forget the open owner and start the sequence id from 0
> > again. It is not required to remember sequence ids for open owners that
> > aren't in use.
> >
> > Our current client could easily trigger this bug even without a
> > umount/mount.
>
> The client is holding open state. Here's the exact reproducer on my modified client:
>
> 1. mount server:/export /mnt
> 2. touch /mnt/newfile
> 3. umount /mnt
> 4. mount server:/export /mnt
> 5. touch /mnt/newfile2
>
> Step 5 causes the client to replay an open owner with a reset sequence ID, and the server replies BAD_SEQID.

touch won't keep the file open. There is no open state once touch has
finished executing.

What you have exposed above is a _server_ bug. The server is _not_
allowed to assume that the client will cache an open owner forever once
it no longer holds any open state using that open owner.

We had a long discussion about this on the mailing list a few years ago,
with David Robinson being the person who formulated the above rule.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
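
For illustration only, here is a minimal sketch of the rule above as a toy Python model. It is not nfsd or any real server's code. The names `ToyServer`, `forget_idle_owners`, and `touch_umount_mount_touch` are hypothetical, and the model is an assumption-laden simplification: it ignores replay detection, OPEN_CONFIRM, and lease expiry, and it treats each `touch` as one OPEN followed by one CLOSE. It contrasts a server that pins an idle open owner's sequence ID with one that lets an idle open owner start over.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_BAD_SEQID = "NFS4ERR_BAD_SEQID"


@dataclass
class OpenOwner:
    last_seqid: int                       # last sequence ID accepted from this owner
    open_files: set = field(default_factory=set)


class ToyServer:
    """Toy model of server-side open-owner tracking (hypothetical, simplified)."""

    def __init__(self, forget_idle_owners):
        # Assumption: True models a server that does not rely on the client
        # caching an open owner once that owner holds no open state.
        self.forget_idle_owners = forget_idle_owners
        self.owners = {}

    def open(self, owner_id, seqid, path):
        owner = self.owners.get(owner_id)
        if owner is not None and not owner.open_files and self.forget_idle_owners:
            owner = None                  # idle owner: treat the request as a new owner
        if owner is None:
            owner = OpenOwner(last_seqid=seqid)
            self.owners[owner_id] = owner
        elif seqid != owner.last_seqid + 1:
            return NFS4ERR_BAD_SEQID      # known owner, unexpected sequence ID
        else:
            owner.last_seqid = seqid
        owner.open_files.add(path)
        return NFS4_OK

    def close(self, owner_id, seqid, path):
        owner = self.owners.get(owner_id)
        if owner is None or seqid != owner.last_seqid + 1:
            return NFS4ERR_BAD_SEQID
        owner.last_seqid = seqid
        owner.open_files.discard(path)
        return NFS4_OK


def touch_umount_mount_touch(server):
    """Replay the five-step reproducer as seen by the server."""
    results = []
    # Steps 1-2: first mount, open owner 0, sequence IDs start at 0.
    results.append(server.open(0, 0, "newfile"))
    results.append(server.close(0, 1, "newfile"))
    # Steps 3-5: umount / mount resets the client's owner IDs and sequence IDs.
    results.append(server.open(0, 0, "newfile2"))
    return results


if __name__ == "__main__":
    print(touch_umount_mount_touch(ToyServer(forget_idle_owners=False)))
    print(touch_umount_mount_touch(ToyServer(forget_idle_owners=True)))
```

In this model the strict server rejects the final open because it still expects the next sequence ID from an owner that holds no open files, while the other server accepts it as a fresh open owner.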