Re: Optimal NFS mount options to safely allow interrupts and timeouts on newer kernels

"Brian Hawley" <bhawley@xxxxxxxxxxx> · Thu, 6 Mar 2014 05:47:19 +0000

I ended up writing a "manage_mounts" script, run by cron, that compares /proc/mounts with the fstab and uses ping and the "timeout" messages in /var/log/messages to identify filesystems that aren't responding. It repeatedly does umount -f on those to force i/o errors back to the calling applications, and when mounts are missing (in fstab but not in /proc/mounts) but the server is pingable again, it attempts to remount them.
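
The comparison step in that script can be sketched roughly as below. This is a simplified illustration, not the actual script: the function name missing_nfs_mounts is made up here, it covers only the fstab versus /proc/mounts check (no ping or log scanning), and it assumes mount points contain no spaces.

```shell
# Sketch only: list NFS mount points that are in fstab but not in /proc/mounts.
# missing_nfs_mounts is a hypothetical helper name; both file arguments
# default to the real system files.
missing_nfs_mounts() {
    fstab=${1:-/etc/fstab}
    mounts=${2:-/proc/mounts}
    awk '
        # First file (mounts table): remember every active mount point.
        NR == FNR { mounted[$2] = 1; next }
        # Second file (fstab): skip comments, blank and short lines.
        /^[ \t]*#/ || NF < 3 { next }
        # Report nfs/nfs4 entries that are not currently mounted.
        ($3 == "nfs" || $3 == "nfs4") && !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}
```

Each path it prints is a candidate for a remount attempt once the server answers ping again.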

For me, timeo and retrans are necessary, but not sufficient.  The chunking to rsize/wsize and the caching both play a role in how well i/o errors get relayed back to the applications doing the i/o.
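
For reference, timeo is given in tenths of a second, which is easy to misread. A small sketch of building the option string from the original question follows; the helper names are invented for illustration, and server:/export and /mnt/point in the comment are placeholders.

```shell
# Sketch: build the soft-mount option string discussed in this thread.
# nfs_soft_opts and timeo_seconds are hypothetical helper names.
nfs_soft_opts() {
    # $1 = timeo in tenths of a second, $2 = retrans count
    printf 'soft,proto=tcp,timeo=%d,retrans=%d\n' "$1" "$2"
}

timeo_seconds() {
    # Convert a timeo value (tenths of a second) to seconds.
    printf '%d.%d\n' "$(( $1 / 10 ))" "$(( $1 % 10 ))"
}

# Example (not run here; server:/export and /mnt/point are placeholders):
#   mount -t nfs -o "$(nfs_soft_opts 10 3)" server:/export /mnt/point
```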

You will certainly lose data in these scenarios.

It would be fantastic if somehow timeo and retrans were sufficient, i.e. when they fail, i/o errors get back to the applications that queued that i/o (or even to the i/o that caused the application to pend because the rsize/wsize buffer or cache was full).

You can eliminate some of that behavior with sync/directio, but performance becomes abysmal.

I tried "lazy" unmounts, but they didn't provide the desired effect (the filesystems were unmounted, which prevented new i/o's, but existing i/o's never got errors).

-----Original Message-----
From: NeilBrown <neilb@xxxxxxx>
Sender: linux-nfs-owner@xxxxxxxxxxxxxxx
Date: 	Thu, 6 Mar 2014 16:37:21 
To: Andrew Martin<amartin@xxxxxxxxxxx>
Cc: <linux-nfs@xxxxxxxxxxxxxxx>
Subject: Re: Optimal NFS mount options to safely allow interrupts and
 timeouts on newer kernels

On Wed, 5 Mar 2014 23:03:43 -0600 (CST) Andrew Martin <amartin@xxxxxxxxxxx>
wrote:

> ----- Original Message -----
> > From: "NeilBrown" <neilb@xxxxxxx>
> > To: "Andrew Martin" <amartin@xxxxxxxxxxx>
> > Cc: linux-nfs@xxxxxxxxxxxxxxx
> > Sent: Wednesday, March 5, 2014 9:50:42 PM
> > Subject: Re: Optimal NFS mount options to safely allow interrupts and timeouts on newer kernels
> > 
> > On Wed, 5 Mar 2014 11:45:24 -0600 (CST) Andrew Martin <amartin@xxxxxxxxxxx>
> > wrote:
> > 
> > > Hello,
> > > 
> > > Is it safe to use the "soft" mount option with proto=tcp on newer kernels
> > > (e.g. 3.2 and newer)? Currently using the "defaults" nfs mount options on
> > > Ubuntu 12.04 results in processes blocking forever in uninterruptible sleep
> > > if they attempt to access a mountpoint while the NFS server is offline. I
> > > would prefer that NFS simply return an error to the clients after retrying
> > > a few times, however I also cannot have data loss. From the man page, I
> > > think these options will give that effect?
> > > soft,proto=tcp,timeo=10,retrans=3
> > > 
> > > From my understanding, this will cause NFS to retry the connection 3 times
> > > (once per second), and then if all 3 are unsuccessful return an error to the
> > > application. Is this correct? Is there a risk of data loss or corruption by
> > > using "soft" in this way? Or is there a better way to approach this?
> > 
> > I think your best bet is to use an auto-mounter so that the filesystem gets
> > unmounted if the server isn't available.
> Would this still succeed in unmounting the filesystem if there are already
> processes requesting files from it (and blocking in uninterruptible sleep)?

The kernel would allow a 'lazy' unmount in this case.  I don't know if any
automounter would try a lazy unmount though - I suspect not.

A long time ago I used "amd" which would create symlinks to a separate tree
where the filesystems were mounted.  I'm pretty sure that when a server went
away the symlink would disappear even if the unmount failed.
So while any processes accessing the filesystem would block, new processes
would not be able to find the filesystem and so would not block.

> 
> > "soft" always implies the risk of data loss.  "Nulls Frequently Substituted"
> > as it was described to me very many years ago.
> > 
> > Possibly it would be good to have something between 'hard' and 'soft' for
> > cases like yours (you aren't the first to ask).
> > 
> >  From http://docstore.mik.ua/orelly/networking/puis/ch20_01.htm
> > 
> >    BSDI and OSF/1 also have a spongy option that is similar to hard, except
> >    that the stat, lookup, fsstat, readlink, and readdir operations behave
> >    like a soft MOUNT.
> > 
> > Linux doesn't have 'spongy'.  Maybe it could.  Or maybe it was a failed
> > experiment and there are good reasons not to want it.
> 
> The problem that sparked this question is a webserver where apache can serve
> files from an NFS mount. If the NFS server becomes unavailable, then the apache
> processes block in uninterruptible sleep and drive the load very high, forcing
> a server restart. It would be better for this case if the mount would simply 
> return an error to apache, so that it would give up rather than blocking 
> forever and taking down the system. Can such behavior be achieved safely?

If you have a monitoring program that notices this high load you can try
  umount -f /mount/point

The "-f" should cause outstanding requests to fail.  That doesn't stop more
requests being made though so it might not be completely successful.
Possibly running it several times would help.

  mount --move /mount/point /somewhere/safe
  for i in {1..15}; do umount -f /somewhere/safe; done

might be even better, if you can get "mount --move" to work.  It doesn't work
for me, probably the fault of systemd (isn't everything :-)).
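
A variation on the loop above that stops as soon as the mount point is gone from /proc/mounts might look like the following. It is only a sketch: is_mounted and force_umount are made-up names, the one-second pause is arbitrary, and it has the same caveat that new requests can arrive between attempts.

```shell
# Sketch only: repeat "umount -f" until the mount point disappears.
# is_mounted and force_umount are hypothetical helper names.
is_mounted() {
    # $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

force_umount() {
    # $1 = mount point; try up to 15 times, as in the loop above
    i=0
    while [ "$i" -lt 15 ] && is_mounted "$1"; do
        umount -f "$1"
        sleep 1
        i=$((i + 1))
    done
    # Succeed only if the mount point is no longer listed.
    ! is_mounted "$1"
}
```

A monitoring job could call force_umount on the affected mount point and treat a non-zero exit status as "still stuck".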

NeilBrown
