Re: rapid clustered nfs server failover and hung clients -- how best to close the sockets?

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Mon, 9 Jun 2008 13:23:13 -0400

On Mon, Jun 09, 2008 at 12:02:43PM -0400, Jeff Layton wrote:
> On Mon, 9 Jun 2008 11:51:36 -0400
> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> 
> > On Mon, Jun 09, 2008 at 10:31:37AM -0400, Jeff Layton wrote:
> > > I can think of 3 ways to fix this:
> > > 
> > > 1) Add something like the "unlock_ip" interface that was recently
> > > added for NLM. Maybe a "close_ip" that allows us to close all
> > > nfsd sockets connected to a given local IP address. So clustering
> > > software could do something like:
> > > 
> > >     # echo 10.20.30.40 > /proc/fs/nfsd/close_ip
> > > 
> > > ...and make sure that all of the sockets are closed.
> > > 
> > > 2) use the same "unlock_ip" interface and have it also
> > > close sockets in addition to dropping locks.
> > > 
> > > 3) have an nfsd close all non-listening connections when it gets a
> > > certain signal (maybe SIGUSR1 or something). Clients on
> > > sockets that aren't failing over should just get a RST and would
> > > reopen their connections.
> > > 
> > > ...my preference would probably be approach #1.
> > 
> > What do you see as the advantage of #1 over #2?  Are there cases where
> > someone would want to drop locks but not also close connections (or
> > vice-versa)?
> > 
> 
> There's no real advantage that I can see (maybe if they're running a
> cluster with no NLM services somehow). Mostly that "unlock_ip" seems to
> imply that it deals with locking, and this doesn't. I'd be OK with #2
> if it's a reasonable solution. Given what Chuck mentioned, it sounds
> like we'll also need to take care to make sure that existing calls
> complete and the replies get flushed out too, so this could be more
> complicated than I had anticipated.

It seems to me that in the long run what we'd like is a virtualized NFS
service--you should be able to start and stop independent "servers"
hosted on a single kernel, and to clients they should look like
completely independent servers.

And I guess the question is how little "virtualization" you can get away
with and still have the whole thing work.

But anyway, ideally I think there'd be a single interface that says
"shut down the nfs service provided via server ip x.y.z.w, for possible
migration to another host".  That's the only operation anyone really
wants to do--independent control over the tcp connections, and the locks,
and the rpc cache, and whatever else needs to be dealt with, sounds
unlikely to be useful.
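
To make the contrast concrete, here is a rough sketch of what cluster
software ends up doing if the pieces stay separate. It is only an
illustration, not a proposed patch: release_server_ip and NFSD_CTL_DIR
are names made up for this example, and "close_ip" is just the
hypothetical file from option #1 above (it does not exist). Only
"unlock_ip" is the interface already added for NLM.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from any real kernel interface):
#   - release_server_ip and NFSD_CTL_DIR are hypothetical names
#   - "close_ip" is the *proposed* control file and does not exist
# "unlock_ip" is the existing NLM interface mentioned in this thread.

release_server_ip() {
    ip=${1:-}
    # Control directory; overridable so the function can be tried
    # against a scratch directory instead of the real /proc.
    dir=${NFSD_CTL_DIR:-/proc/fs/nfsd}

    if [ -z "$ip" ]; then
        echo "usage: release_server_ip <server-ip>" >&2
        return 2
    fi

    # One write per piece of state: locks first, then connections.
    for ctl in unlock_ip close_ip; do
        if [ -w "$dir/$ctl" ]; then
            printf '%s\n' "$ip" > "$dir/$ctl" || return 1
        else
            echo "skipping $ctl: not available in $dir" >&2
        fi
    done
}

# Example (as root, on the node giving up the address):
#   release_server_ip 10.20.30.40
```

With a single "shut down the service on x.y.z.w" interface, that loop
would collapse to one write, and the cluster software wouldn't need to
know the list of pieces or the order they have to be handled in.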

--b.