Re: what on earth is going on here? paths above mountpoints turn into "(unreachable)"

On 10 Feb 2015, J. Bruce Fields said:

> On Tue, Feb 10, 2015 at 05:48:48PM +0000, Nix wrote:
>> On 5 Feb 2015, NeilBrown spake thusly:
>> 
>> > On Wed, 04 Feb 2015 23:28:17 +0000 Nix <nix@xxxxxxxxxxxxx> wrote:
>> >> It doesn't. It still recurs.
>> >
>> > Is /usr/archive still exported to mutilate with crossmnt?
>> > If it is, can you change to not do that (it is quite possible to have
>> > different export options for different clients).
>> 
>> OK. Adjusted.
>> 
>> > I think that if crossmnt is enabled on the server, then explicitly
>> > mounting /usr/archive/series will have the same net effect as not doing so
>> > (though I'm not 100% certain).
>> >
>> > Also, can you try changing
>> >    /proc/sys/fs/nfs/nfs_mountpoint_timeout
>> >
>> > It defaults to 500 (seconds - time for light from Sun to reach Earth).
>> > If you make it smaller and the problem gets worse, or make it much bigger
>> > and the problem goes away, that would be interesting.
>> > If it makes no difference, that also would be interesting.
>> 
>> Seems to make no difference, which is distinctly surprising. If
>> anything, it happens more often at the default value than at either the
>> high or low values. It's very erratic: it happened ten times in one day,
>> then three days passed and it didn't happen at all... system under
>> very similar load the whole time.
>> 
>> From other prompts, what I'm seeing now -- but wasn't then, before I
>> took the crossmnt out -- is an epidemic of spontaneous unmounting: i.e.,
>> /usr/archive/series suddenly vanishes until remounted.
>> 
>> I might just reboot all systems involved in this mess and hope it goes
>> away. I have no *clue* what's going on, I've never seen it before, maybe
>> it'll stop if I no longer believe in it.
>
> It might be interesting to see output from
>
> 	rpcdebug -m rpc -s cache
> 	cat /proc/net/rpc/nfsd.export/content
> 	cat /proc/net/rpc/nfsd.fh/content
>
> especially after the problem manifests.

It's manifested right now, as a matter of fact.

# cat /proc/net/rpc/nfsd.export/content
#path domain(flags)
/usr/src        mutilate.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,fsid=16,uuid=333950aa:8e3f440a:bc94d0cc:4adae198,sec=1)
/usr/share/texlive      mutilate.wkstn.nix(rw,no_root_squash,async,wdelay,fsid=7,uuid=5cccc224:a92440ee:b4450447:3898c2ec,sec=1)
/home/.spindle.srvr.nix mutilate.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,fsid=1,uuid=95bd22c2:253c456f:8e36b6cf:b9ecd4ef,sec=1)
/usr/archive/series     *.srvr.nix,xios.srvr.nix(ro,insecure,root_squash,async,wdelay,no_subtree_check,fsid=29,uuid=543a1ca9:d17246ca:b6c53092:5896549d,sec=1)
/usr/lib/X11/fonts      mutilate.wkstn.nix(ro,root_squash,async,wdelay,fsid=12,uuid=5cccc224:a92440ee:b4450447:3898c2ec,sec=1)
/home/.spindle.srvr.nix *.srvr.nix,fold.srvr.nix(rw,root_squash,async,wdelay,no_subtree_check,fsid=1,uuid=95bd22c2:253c456f:8e36b6cf:b9ecd4ef,sec=1)
/usr/archive    mutilate.wkstn.nix(rw,insecure,root_squash,async,wdelay,fsid=25,uuid=d20e3edd:06a54a9b:85dcfa19:62975969,sec=1)

# note: no /usr/archive/series entry for mutilate, though I mounted it on
# mutilate and did not unmount it. It no longer appears in /proc/mounts on
# mutilate, and shows up as an empty directory under /usr/archive.
# However, it *does* appear here:

# cat /proc/net/rpc/nfsd.fh/content
#domain fsidtype fsid [path]
*.srvr.nix,xios.srvr.nix 1 0x0000001d /usr/archive/series
mutilate.wkstn.nix 1 0x0000000f /etc/shai-hulud
mutilate.wkstn.nix 1 0x0000000b /pkg/non-free
mutilate.wkstn.nix 1 0x00000016 /usr/share/emacs/site-lisp
mutilate.wkstn.nix 1 0x00000012 /usr/share/httpd/htdocs/munin
mutilate.wkstn.nix 1 0x00000013 /usr/share/clamav
mutilate.wkstn.nix 1 0x0000000a /usr/share/nethack
mutilate.wkstn.nix 1 0x00000009 /usr/share/xplanet
mutilate.wkstn.nix 1 0x00000008 /usr/share/xemacs
mutilate.wkstn.nix 1 0x00000015 /usr/share/flightgear
mutilate.wkstn.nix 1 0x00000005 /usr/doc
mutilate.wkstn.nix 1 0x00000006 /usr/info
mutilate.wkstn.nix 1 0x00000011 /var/state/munin
mutilate.wkstn.nix 1 0x0000000e /var/log.real
mutilate.wkstn.nix 1 0x00000007 /usr/share/texlive
mutilate.wkstn.nix 1 0x00000010 /usr/src
mutilate.wkstn.nix 1 0x0000000c /usr/lib/X11/fonts
mutilate.wkstn.nix 1 0x00000019 /usr/archive
mutilate.wkstn.nix 1 0x0000001d /usr/archive/series
mutilate.wkstn.nix 1 0x00000001 /home/.spindle.srvr.nix
*.srvr.nix,fold.srvr.nix 1 0x00000001 /home/.spindle.srvr.nix
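
Comparing the two dumps by eye is error-prone, so here is a minimal
sketch of how they could be cross-referenced. It assumes the layouts
shown above (path, then domain(flags) in nfsd.export; domain, fsidtype,
hex fsid, path in nfsd.fh) and paths without whitespace. The function
names are made up for illustration. Both caches are filled on demand, so
a missing export entry is only a hint, not proof of a fault.

```python
import re


def parse_exports(text):
    """Map (domain, path) -> decimal fsid from nfsd.export/content text."""
    exports = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks, the header and comments
        path, rest = line.split(None, 1)
        domain, _, flags = rest.partition("(")
        match = re.search(r"\bfsid=(\d+)", flags)
        exports[(domain, path)] = int(match.group(1)) if match else None
    return exports


def parse_fh(text):
    """Map (domain, path) -> fsid from nfsd.fh/content text (fsid is hex)."""
    handles = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip the header and entries with no path
        domain, _fsidtype, fsid, path = fields[:4]
        handles[(domain, path)] = int(fsid, 16)  # 0x0000001d -> 29
    return handles


def missing_exports(export_text, fh_text):
    """List (domain, path) pairs present in nfsd.fh but not in nfsd.export."""
    exports = parse_exports(export_text)
    return sorted(key for key in parse_fh(fh_text) if key not in exports)
```

Fed the contents of the two /proc files above, missing_exports() would
include ("mutilate.wkstn.nix", "/usr/archive/series"), along with the
other paths that have a filehandle entry but no export entry currently
cached.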

When this happens, I get "(unreachable)" paths and broken symlinks under
/proc (not really surprising, as the mountpoint has gone) -- but in this
situation, cd'ing out and back in does not fix it; only a remount does.
I'm not surprised by *those* symptoms at all.
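
Since the unmounts are so erratic, timestamps of when the mount vanishes
might help correlate them with server-side events. A rough sketch of a
client-side watcher follows. It assumes the usual /proc/mounts layout
(device, mountpoint, type, options, ...), and the function names and the
10-second polling interval are arbitrary choices for illustration.

```python
import time


def mounted_paths(proc_mounts_text):
    """Return the set of mountpoints listed in /proc/mounts-format text."""
    paths = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            paths.add(fields[1])  # second field is the mountpoint
    return paths


def is_mounted(proc_mounts_text, mountpoint):
    """True if mountpoint appears as a mountpoint in the given text."""
    wanted = mountpoint.rstrip("/") or "/"
    return wanted in mounted_paths(proc_mounts_text)


def watch(mountpoint, interval=10.0, mounts_file="/proc/mounts"):
    """Poll forever, printing a timestamped line whenever the state changes."""
    last = None
    while True:
        with open(mounts_file) as handle:
            now = is_mounted(handle.read(), mountpoint)
        if now != last:
            state = "mounted" if now else "NOT mounted"
            print(time.strftime("%Y-%m-%d %H:%M:%S"), mountpoint, state, flush=True)
            last = now
        time.sleep(interval)
```

Running watch("/usr/archive/series") on mutilate would log one line at
startup and one more each time the mount disappears or comes back.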

> Also, /usr/archive/series is a separate filesystem from /usr/archive,
> right?  (The output of "mount" run on the server might also be useful.)

They are separate server filesystems:

/dev/mapper/main-archive /usr/archive ext4 rw,nosuid,nodev,relatime,nobarrier,commit=30,data=ordered 0 0
/dev/sdc1 /usr/archive/series ext4 rw,nosuid,nodev,relatime,commit=30,data=ordered 0 0
/dev/mapper/main-winbackup /usr/archive/winbackup ext4 rw,nosuid,nodev,relatime,nobarrier,commit=30,data=ordered 0 0

> The reason crossmnt is considered "bad and evil" is that nfsv2 and v3
> clients don't necessarily expect mountpoints within exports, and may
> get confused when (for example) they discover two files with the same
> inode number that appear to be on the same filesystem.

That I expected. NFS mounts within NFS mounts are presumably fine (I
hope so, I've been using them extensively for decades).
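
To make the inode-number problem concrete: many tools identify a file by
its (device, inode) pair, and the root directory of every ext4 filesystem
is inode 2. The sketch below is a toy model rather than what any
particular client does. It uses the export fsids (25 and 29) as stand-ins
for the device number a client might assign, and the function name is
invented for illustration. Hard links also share a (device, inode) pair
legitimately, so a real check would have to account for them.

```python
from collections import defaultdict


def colliding_ids(entries):
    """Group paths by (dev, ino) and return only the groups with >1 path.

    entries is an iterable of (path, dev, ino) tuples.
    """
    by_id = defaultdict(list)
    for path, dev, ino in entries:
        by_id[(dev, ino)].append(path)
    return {key: sorted(paths) for key, paths in by_id.items() if len(paths) > 1}


# Client that treats everything under the export as one filesystem:
# both ext4 root directories (inode 2) land on the same device number.
merged = [("/usr/archive", 25, 2), ("/usr/archive/series", 25, 2)]

# Client that gives each server filesystem its own device number.
separate = [("/usr/archive", 25, 2), ("/usr/archive/series", 29, 2)]
```

colliding_ids(merged) reports the two directories as the same object,
while colliding_ids(separate) returns an empty dict.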

> I'm not actually sure what the current linux client does--I think it
> may be smart enough to use the fsid to avoid at least some of those
> problems.  But NFSv4 clients are the only ones that should really be
> counted on to get this right.

I wish I could get NFSv4 to work. It's just screamed about a lack of
adequate authentication every time I've tried it, and my network is so
NFS-dependent that significant experimentation is difficult (getting
anything wrong tends to cause my entire desktop to deadlock in seconds).
I suppose I should set up some VMs and play in there :)

-- 
NULL && (void)