Re: [PATCH 3/8] mountd: remove 'dev_missing' checks

NeilBrown <neilb@xxxxxxxx> · Wed, 20 Jul 2016 08:50:12 +1000

On Tue, Jul 19 2016, J. Bruce Fields wrote:

> On Thu, Jul 14, 2016 at 12:26:43PM +1000, NeilBrown wrote:
>> I now think this was a mistaken idea.
>> 
>> If a filesystem is exported with the "mountpoint" or "mp" option, it
>> should only be exported if the directory is a mount point.  The
>> intention is that if there is a problem with one filesystem on a
>> server, the others can still be exported, but clients won't
>> incorrectly see the empty directory on the parent when accessing the
>> missing filesystem; they will see clearly that the filesystem is
>> missing.
>> 
>> It is easy to handle this correctly for NFSv3 MOUNT requests, but what
>> is the correct behavior if a client already has the filesystem mounted
>> and so has a filehandle?  Maybe the server rebooted and came back with
>> one device missing.  What should the client see?
>> 
>> The "dev_missing" code tries to detect this case and causes the server
>> to respond with silence rather than ESTALE.  The idea was that the
>> client would retry and when (or if) the filesystem came back, service
>> would be transparently restored.
>> 
>> The problem with this is that arbitrarily long delays are not what
>> people would expect, and can be quite annoying.  ESTALE, while
>> unpleasant, is at least easily understood.  A device disappearing is a
>> fairly significant event and hiding it doesn't really serve anyone.
>
> It could also be a filesystem disappearing because it failed to mount in
> time on a reboot.

I don't think "in time" is really an issue.  Boot sequencing should not
start nfsd until everything in /etc/fstab is either mounted, or has
failed and the failure has been deemed acceptable.
That is why nfs-server.service has "After= local-fs.target".

>
>> So: remove the code and allow ESTALE.
>
> I'm not completely sure I understand the justification.

"hangs are bad".

When you cannot get a reply from the NFS server there are multiple
possible causes, from a temporary network glitch to server-is-dead.
You cannot reliably differentiate, so you have to just wait.

The server itself doesn't have the same uncertainty about its exported
filesystems.  They are either working or they aren't.
So it is possible, and I think reasonable, to send a more definitive
reply - ESTALE.
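
To make that concrete, here is a rough sketch in Python (not the mountd
code) of the decision I have in mind.  The names is_mount_point,
reply_for_filehandle and Reply are made up for illustration, and the
mount point test is the simple "different device from the parent, or is
the root" check, which would not notice a bind mount within one
filesystem.

```python
import os
import stat
from enum import Enum


class Reply(Enum):
    """Hypothetical outcomes for a request against an export."""
    OK = "ok"
    ESTALE = "estale"


def is_mount_point(path):
    """Return True if path is a directory on a different device from its
    parent, or is the root directory (it is its own parent)."""
    try:
        here = os.stat(path)
        parent = os.stat(os.path.join(path, ".."))
    except OSError:
        # Missing or inaccessible path: treat as "not mounted".
        return False
    if not stat.S_ISDIR(here.st_mode):
        return False
    return here.st_dev != parent.st_dev or here.st_ino == parent.st_ino


def reply_for_filehandle(export_dir, require_mountpoint):
    """Give a definite answer instead of staying silent: if the export
    asked for "mountpoint" and nothing is mounted there, say ESTALE."""
    if require_mountpoint and not is_mount_point(export_dir):
        return Reply.ESTALE
    return Reply.OK
```

The point is only that the server can always compute an answer here, so
it never needs to leave the client waiting.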

This particularly became an issue with NFSv4.
With NFSv3, mounting the filesystem is distinct from accessing it.
So it was easy to fail a mount request but delay an access request.
With NFSv4 we don't have that distinction.  If we make accesses wait,
then we make mount attempts wait too, which isn't at all friendly.

>
> I don't like the current behavior either--I'd be happy if we could
> deprecate "mountpoint" entirely--but changing it now would seem to risk
> regressions if anyone depends on it.

True.  There isn't really an easy solution there.

"mountpoint" seemed like a good idea when I wrote it.  But I never got
any proper peer review.

Thanks,
NeilBrown