On Wed, 25 Jan 2012 08:11:17 -0500
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Wed, Jan 25, 2012 at 06:41:58AM -0500, Jeff Layton wrote:
> > On Tue, 24 Jan 2012 18:08:55 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> > 
> > > On Mon, Jan 23, 2012 at 03:01:01PM -0500, Jeff Layton wrote:
> > > > This is the fourth iteration of this patchset. I had originally
> > > > asked Bruce to take the last one for 3.3, but decided at the last
> > > > minute to wait on it a bit. I knew there would be some changes
> > > > needed in the upcall, so by waiting we can avoid needing to deal
> > > > with those in code that has already shipped. I would like to see
> > > > this patchset considered for 3.4, however.
> > > > 
> > > > The previous patchset can be viewed here. That set also contains
> > > > a more comprehensive description of the rationale for this:
> > > > 
> > > >     http://www.spinics.net/lists/linux-nfs/msg26324.html
> > > > 
> > > > There have been a number of significant changes since the last set:
> > > > 
> > > > - the remove/expire upcall is now gone. In a clustered environment,
> > > >   the records would need to be refcounted in order to handle that
> > > >   properly, which becomes a sticky problem when nodes can reboot.
> > > >   We don't really need to remove these records individually,
> > > >   however; cleaning them out only when the grace period ends
> > > >   should be sufficient.
> > > 
> > > I don't think so:
> > > 
> > > 1. Client establishes state with server.
> > > 2. Network goes down.
> > > 3. A lease period passes without the client being able to renew.
> > >    The server expires the client and grants conflicting locks to
> > >    other clients.
> > > 4. Server reboots.
> > > 5. Network comes back up.
> > > 
> > > At this point, the client sees that the server has rebooted and is
> > > in its grace period, and reclaims. Ooops.
> > > 
> > > The server needs to be able to tell the client "nope, you're not
> > > allowed to reclaim any more" at this point.
> > > 
> > > So we need some sort of remove/expire upcall.
> > 
> > Doh! I don't know what I was thinking -- you're correct and we do
> > need that.
> > 
> > Ok, I'll see about putting it back and will resend. That does make
> > it rather nasty to handle clients mounting from multiple nodes in
> > the same cluster, though. We'll need to come up with a data model
> > that allows for that as well.
> 
> Honestly, in the v4-based migration case, if one client can hold state
> on multiple nodes, and could (could it?) after a reboot decide to
> reclaim state on a different node from the one it previously held the
> same state on -- I'm not even clear what *should* happen, or whether
> the protocol is really adequate for that case.
> 
> --b.

That was one of Chuck's concerns, IIUC:

--------------[snip]----------------
What if a server has more than one address? For example, an IPv4 and an
IPv6 address? Does it get two separate database files? If so, how do you
ensure that a client's nfs_client_id4 is recorded in both places
atomically?

I'm not sure tying the server's identity to an IP address is wise.
--------------[snip]----------------

This is the problem: we need to tie the record to some property that is
invariant for the NFS server "instance". That can't be a physical nodeid
or anything like it, since part of the goal here is to allow cluster
services to float freely between nodes. I would really like to avoid
establishing some abstract "service ID", since we'd have to track that
on stable storage on a per-nfs-service basis.
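To make the sort of thing I have in mind a bit more concrete, here's a
rough sketch of what a stable-storage record and the grace-period reclaim
check could look like if we key the records on the server address. This
is purely illustrative -- the struct layout, names, and helpers below are
invented for discussion and are not the actual upcall/nfsdcltrack code:

/*
 * Illustrative sketch only -- not the real upcall code.  One record per
 * (server address, client identifier) pair on stable storage, modelled
 * here as a simple in-memory table.  An "expire" upcall drops the
 * record; a reclaim is honored only while a matching record exists.
 */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

#define CLID_MAX	1024	/* max opaque client-id length (illustrative) */
#define TABLE_MAX	1024

struct cl_track_rec {
	struct sockaddr_storage server_addr;	/* address the client mounted */
	char client_id[CLID_MAX];		/* opaque nfs_client_id4 */
	size_t client_id_len;
	bool valid;				/* cleared by an expire upcall */
};

static struct cl_track_rec rec_table[TABLE_MAX];
static size_t rec_count;

static struct cl_track_rec *
cl_track_find(const struct sockaddr_storage *addr, const char *id, size_t idlen)
{
	for (size_t i = 0; i < rec_count; i++) {
		struct cl_track_rec *rec = &rec_table[i];

		/* a real comparison would look at family-specific parts only */
		if (rec->valid && rec->client_id_len == idlen &&
		    memcmp(rec->client_id, id, idlen) == 0 &&
		    memcmp(&rec->server_addr, addr, sizeof(*addr)) == 0)
			return rec;
	}
	return NULL;
}

/* "create": client has established state against this server address */
void
cl_track_create(const struct sockaddr_storage *addr, const char *id, size_t idlen)
{
	struct cl_track_rec *rec = cl_track_find(addr, id, idlen);

	if (!rec && rec_count < TABLE_MAX && idlen <= CLID_MAX) {
		rec = &rec_table[rec_count++];
		memcpy(&rec->server_addr, addr, sizeof(*addr));
		memcpy(rec->client_id, id, idlen);
		rec->client_id_len = idlen;
	}
	if (rec)
		rec->valid = true;
}

/* "expire": lease ran out; conflicting locks may have been handed out */
void
cl_track_expire(const struct sockaddr_storage *addr, const char *id, size_t idlen)
{
	struct cl_track_rec *rec = cl_track_find(addr, id, idlen);

	if (rec)
		rec->valid = false;
}

/*
 * "check" during the grace period: allow the reclaim only if we still
 * hold a record for this client against this server address.  This is
 * what closes the hole in the scenario above -- an expired client finds
 * no record and has its reclaim refused rather than getting its old
 * locks back.
 */
bool
cl_track_allow_reclaim(const struct sockaddr_storage *addr, const char *id, size_t idlen)
{
	return cl_track_find(addr, id, idlen) != NULL;
}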
The server address seems like a natural fit here. With the design I'm
proposing, a client will need to reestablish its state on another node
if it migrates for any reason.

Chuck, what was your specific worry about tracking these on a
per-server-address basis? Can you outline a scenario where that would
break something?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>