Hi Alan,

Thanks for getting back.

On Wed, 2011-08-17 at 12:00 +0100, Alan Brown wrote:
> Colin Simpson wrote:
>
> > ,when the service is stopped I get a "Stale NFS file handle" from
> > mounted filesystems accessing the NFS mount point at those times. i.e.
> > if I have a copy going I get on the service being disabled:
>
> That's normal if a NFS server mount is unexported or nfsd shuts down.
>
> It _should_ (but doesn't always) clear when NFS resumes.

It does clear the "stale NFS file handle" when the service fails over,
but that's not really the issue for me. My beef is that the "stale NFS
file handle" is liable to cause apps on the clients to get upset (they
may simply quit), whereas the hang merely suspends client apps using
this mount point until the service fails over, which seems the better
behaviour.

> > Why is there a behaviour disparity? Which is correct?
>
> They're both correct - and both wrong. :)

But it seems that a subtle change to the NFS service setup in the
config, i.e. the IP resource containing the NFS export and client vs
the IP sitting at the top level (at the same level as the NFS export),
results in the NFS service behaving like a hard mount vs a soft mount,
even though I'm mounting hard in both cases from the clients (see the
illustrative fragments at [1] below). Maybe I'm confused; it just seems
pretty unclear why the behaviour should differ. The config fragment you
gave behaves properly for me (IMHO): the clients hang until service
failover, exactly like my first case where the IP resource contains the
NFS export etc.

> on server side, restarting the nfslock service is usually sufficient to
> get umount to work (It's safe, clients are told to reacquire their
> locks)
>
> What's your backend? GFS?
>
> > But more seriously I can't easily shut the cluster down cleanly when
> > told to by a UPS on power outage. Shutting down the node will be unable
> > to be performed cleanly as a resource is open (so will be liable to
> > fencing).
>
> If the filesystem is GFS, fencing is about the only reliable way of
> leaving the cluster.

The backend I'm trying is ext4 (failing over the mount). I had tried
manually stopping nfsd and nfslock (even though the cluster seems to
drop the locks anyway, judging by the output it writes to the messages
file), and sadly that makes no difference: the umount still fails, even
after leaving it for ages.

The failure to umount only seems to occur if there is a large amount of
continuous activity against this NFS export (e.g. copying a large file
over when the service fails or is stopped). I wonder if this hanging
isn't unexpected with NFS, given the "self_fence" option provided in
the fs resource [2].

Thanks again

Colin
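
[1] Purely illustrative sketches of the two service shapes I mean; all
the names, devices and addresses below are made up, and the nesting may
not match my real config exactly:

    <!-- shape A: the ip resource contains the export and the client -->
    <service name="nfssvc" recovery="relocate">
      <fs name="nfsdata" device="/dev/vg_cluster/lv_nfs"
          mountpoint="/export/data" fstype="ext4">
        <ip address="192.168.0.50" monitor_link="1">
          <nfsexport name="exports">
            <nfsclient name="everyone" target="*" options="rw,sync"/>
          </nfsexport>
        </ip>
      </fs>
    </service>

    <!-- shape B: the ip sits at the top level of the service,
         alongside the branch that holds the export -->
    <service name="nfssvc" recovery="relocate">
      <fs name="nfsdata" device="/dev/vg_cluster/lv_nfs"
          mountpoint="/export/data" fstype="ext4">
        <nfsexport name="exports">
          <nfsclient name="everyone" target="*" options="rw,sync"/>
        </nfsexport>
      </fs>
      <ip address="192.168.0.50" monitor_link="1"/>
    </service>

Shape A is my first case (clients hang until failover); shape B is the
one that gives the clients the "stale NFS file handle".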
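
[2] For reference, the knob I mean is the self_fence attribute on the
fs resource; again just a made-up example, not my actual config:

    <fs name="nfsdata" device="/dev/vg_cluster/lv_nfs"
        mountpoint="/export/data" fstype="ext4"
        force_unmount="1" self_fence="1"/>

As I understand it, self_fence makes the node reboot itself if the
filesystem can't be unmounted when the service stops, which suggests an
unmount hanging under load is an anticipated failure mode.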