J. Bruce Fields wrote:
On Fri, Apr 25, 2008 at 09:47:03AM -0400, Wendy Cheng wrote:
Bernd Schubert wrote:
Hello,
on servers with heartbeat-managed resources one rather often has the
situation that different directories are exported from different resources.
It may happen that all resources are running on one host, but they can
also run on different hosts. The situation gets even more complicated
if the server is also an NFS client.
In principle having different NFS resources works fine; only the statd
state directory is a problem, or really the statd concept as a whole.
We would actually need several instances of statd running, each using a
different state directory. These would then have to be migrated from one
server to the other on resource movement. However, as far as I understand
it, the basic concept for this does not even exist yet, does it?
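Just to illustrate the setup being described (this is not from any submitted
patch set): a minimal sketch that spawns one rpc.statd per floating-IP
resource, each with its own state directory, assuming the -n/--name and
-P/--state-directory-path options that nfs-utils rpc.statd exposes. As noted
below, spawning extra instances alone does not solve the migration problem.

/* per_resource_statd.c - hypothetical illustration only.
 * Starts one rpc.statd per floating-IP resource, each with its own
 * state directory, using rpc.statd's -n (name) and -P (state
 * directory) options.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct nfs_resource {
    const char *float_ip;   /* floating service address (example values) */
    const char *state_dir;  /* per-resource statd state directory */
};

static const struct nfs_resource resources[] = {
    { "192.168.1.10", "/var/lib/nfs/statd-res1" },
    { "192.168.1.11", "/var/lib/nfs/statd-res2" },
};

int main(void)
{
    for (size_t i = 0; i < sizeof(resources) / sizeof(resources[0]); i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            execlp("rpc.statd", "rpc.statd",
                   "-n", resources[i].float_ip,
                   "-P", resources[i].state_dir,
                   (char *)NULL);
            perror("execlp rpc.statd");
            _exit(1);
        }
    }
    return 0;
}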
Efforts have been made to remedy this issue, and a complete set of
patches has been submitted repeatedly over the past two years. Patch
acceptance has been very slow (I guess people just don't want to be
bothered with cluster issues?).
We definitely want to get this all figured out....
Anyway, the kernel side has the basic infrastructure to handle the
problem (it stores the incoming client's IP address as part of its
book-keeping record) - just a little tweak will do the job. However,
the user-side statd directory needs to be restructured. I didn't
publish the user-side directory-structure script during my last round of
submission. Forking statd into multiple threads does not solve all the
issues. Check out:
https://www.redhat.com/archives/cluster-devel/2007-April/msg00028.html
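For illustration only (a hypothetical layout, not the actual lockd/statd
structures): the kind of per-peer book-keeping record this relies on, tying
each monitored client to the local floating server address its request
arrived on, so that lock state can later be dropped or migrated per resource.

/* Hypothetical sketch - not the real kernel structures. */
#include <sys/socket.h>

struct ha_peer_record {
    struct sockaddr_storage client_addr;  /* the NFS client */
    struct sockaddr_storage server_addr;  /* local address the request hit
                                             (the floating resource IP) */
    char caller_name[256];                /* client name as sent in SM_MON */
    unsigned int nsm_state;               /* client's NSM state number */
};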
So for basic v2/v3 failover, what remains is some statd -H scripts, and
some form of grace period control? Is there anything else we're
missing?
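On the statd -H side, a toy sketch of what such a callout could do, assuming
the interface documented in rpc.statd(8) (the callout is invoked with
add-client or del-client, the client name, and the server name); real HA
scripts would of course do more:

/* ha_callout.c - toy sketch of an rpc.statd -H callout program.
 * argv[1] is "add-client" or "del-client", argv[2] the client name,
 * argv[3] the server name the client used.  Records each monitored
 * client under a per-server directory so the list can be moved on
 * failover.  BASE is a made-up location for this example.
 */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define BASE "/var/lib/nfs/ha"

int main(int argc, char **argv)
{
    char path[512];

    if (argc < 4)
        return 1;

    snprintf(path, sizeof(path), BASE "/%s", argv[3]);
    mkdir(path, 0700);                       /* one directory per server name */
    snprintf(path, sizeof(path), BASE "/%s/%s", argv[3], argv[2]);

    if (strcmp(argv[1], "add-client") == 0) {
        FILE *f = fopen(path, "w");          /* remember this client */
        if (f)
            fclose(f);
    } else if (strcmp(argv[1], "del-client") == 0) {
        unlink(path);                        /* forget it again */
    }
    return 0;
}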
The submitted patch set is reasonably complete ...
There was another thought about the statd patches though - mostly because
of concerns over statd's responsiveness, which depends so much on network
status and the clients' participation. I was hoping NFS v4 would catch up
by the time the v2/v3 grace-period patches got accepted into the mainline
kernel. Ideally the v2/v3 lock-reclaiming logic could use the communication
channel established by v4 servers (or at least a similar implementation) -
that is:
1. Enable the grace period on the secondary server, as in the previously
submitted patches.
2. Drop the locks on the primary server (and chain the dropped locks into
a lock-list; a sketch of a possible entry follows this list).
3. Send the lock-list from the primary server to the backup server via the
v4 communication channel (or a similar implementation).
4. Reclaim the locks on the backup server based on the lock-list.
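A minimal sketch of what one entry on such a lock-list might carry
(hypothetical layout, not the format of the submitted patches), assuming the
records are serialized and pushed from the primary to the backup over
whatever channel step 3 provides:

/* Hypothetical lock-list entry.  One record per lock dropped on the
 * primary in step 2; the backup walks the list in step 4 and
 * re-establishes each lock while its grace period (step 1) keeps
 * ordinary clients from racing in.
 */
#include <stdint.h>
#include <sys/socket.h>

#define FH_MAX 64     /* NFSv3 file handles are at most 64 bytes */

struct failover_lock_entry {
    struct sockaddr_storage client_addr;   /* lock owner's address */
    char caller_name[256];                 /* client name from statd/NLM */
    uint8_t  fh[FH_MAX];                   /* file handle the lock covers */
    uint32_t fh_len;
    uint64_t offset;                       /* byte range of the lock */
    uint64_t length;
    uint32_t svid;                         /* lock owner id (pid) from NLM */
    uint32_t exclusive;                    /* write vs. read lock */
};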
In short, it would be nice to replace the existing statd lock-reclaiming
logic with the above steps, if at all possible, during active-active
failover. Reboot, on the other hand, should stay the same as today's
statd logic, without changes.
-- Wendy