On Jan 25, 2012, at 1:55 PM, J. Bruce Fields wrote:

> On Wed, Jan 25, 2012 at 12:41:27PM -0500, Chuck Lever wrote:
>> If SETCLIENTID returns a unique clientid4 that a client hasn't seen from other servers, the client knows that's a unique server instance which must be recovered separately after a reboot.
>
> Hm, but does it have to do the recovery with that server?

If a client has a lease and open state on that server, it should do recovery if the server reboots.

> And if so, then how does that fit with failover?

We were supposed to discuss that with Bill and Piyush. Maybe we can bring it up again at Connectathon. But my assumption is that failover is supposed to look like a server reboot. The question is which clients the server allows to recover, and which it forces to start fresh. Shouldn't it be enough for a server to remember nfs_client_id4 strings?

> I mean, suppose the whole cluster is rebooted. From the client's point
> of view, its server becomes unresponsive. So it should probably start
> pinging the replicas to see if another one's up. The first server it
> gets a response from won't necessarily be the one it was using before.
> What happens next?

Again, it depends on whether your clustering implementation shares state among all servers in the cluster. Sharing state across a cluster seems like a difficult implementation choice. For simplicity I think each server in a cluster should manage its own leases. NFSv4.0 replication is generally for read-only datasets anyway.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
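
To make the clientid4 point concrete, here is a rough sketch of the bookkeeping a client could do: group server addresses by the clientid4 that SETCLIENTID returned, and treat each distinct clientid4 as a separate server instance with its own lease to recover. The class name, method names, addresses, and clientid4 values are all made up for illustration. This is not how any existing client implements it, and it assumes that a matching clientid4 alone is enough to conclude that two addresses share state, which a real client would want to verify further.

```python
from collections import defaultdict


class LeaseTable:
    """Hypothetical client-side map from clientid4 values to server addresses."""

    def __init__(self):
        # clientid4 (64-bit value) -> set of server addresses that returned it
        self._by_clientid = defaultdict(set)

    def record_setclientid(self, server_addr, clientid4):
        """Record a SETCLIENTID result; return True if clientid4 is new to us."""
        is_new_instance = clientid4 not in self._by_clientid
        self._by_clientid[clientid4].add(server_addr)
        return is_new_instance

    def recovery_groups(self):
        """Return one sorted address list per distinct clientid4 (one per lease)."""
        return [sorted(addrs) for addrs in self._by_clientid.values()]


table = LeaseTable()
first = table.record_setclientid("replica-a", 0x1111)
# Same clientid4 as replica-a: assumed here to mean the two share state.
second = table.record_setclientid("replica-b", 0x1111)
# Unseen clientid4: treated as a separate instance, recovered separately.
third = table.record_setclientid("replica-c", 0x2222)
groups = table.recovery_groups()
```

With servers that each manage their own leases, every replica would land in its own group, and the client would run state recovery once per group after a reboot.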