Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote:
> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@xxxxxxxxxx> wrote:
> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote:
> >> Hi,
> >>
> >> If you try to access an automounted directory and the mount operation fails,
> >> the attributes of the directory (owner, group, mode, access time) are lost.
> >> This happens only when AutoFS' browse mode (ghosting) is enabled.
> >>
> >> How to reproduce the problem:
> >>
> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs.
> >>
> >> 1. Add a simple indirect mount map (it's required that the server does NOT
> >>    export the /notexported directory):
> >>
> >> # cat /etc/auto.master
> >> /nfs    /etc/auto.nfs
> >>
> >> # cat /etc/auto.nfs
> >> v2      server:/notexported
> >>
> >> 2. Restart AutoFS and try to access the directory:
> >>
> >> # cd /nfs/v2
> >> -bash: cd: /nfs/v2: No such file or directory
> >>
> >> # ls -la /nfs
> >> ls: cannot access /nfs/v2: No such file or directory
> >> total 4
> >> drwxr-xr-x  3 root root    0 Sep 24 21:43 .
> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 ..
> >> d?????????  ? ?    ?       ?            ? v2
> >>
> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6.
> >> I found out that the last kernel version that does not exhibit the bug is
> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in
> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by
> >> early crashes.
> >>
> >> Any ideas?
> >
> > Not really.
> >
> > It would be best to start with an full autofs debug log so we can see
> > what is happening within automount and work back from there. We'll
> > probably need to look at the dentry itself as well to see if we can get
> > any info about where it came from possibly, with a kernel dump.
> >
> > Ian
> 
> Hi Ian,
> 
> Here are the debug logs from automount and a snapshot of the dentry
> and inode of the "broken" directory (running 2.6.38).
> 
> I tried to compare it with 2.6.37 (which works), but the structures
> changed a lot between these kernels.

A great deal changed in 2.6.38 in the VFS.

You really can't use 2.6.38 because there were a number of fixes that
went in after this and I think the VFS code up to around 3.0 or 3.1
could have had this problem anyway.

What your looking for is kernels with fs/namei.c:follow_automount() with
this near the top of the function, which shouldn't cause this type of
problem:

       /* We don't want to mount if someone's just doing a stat -
         * unless they're stat'ing a directory and appended a '/' to
         * the name.
         *
         * We do, however, want to mount if someone wants to open or
         * create a file of any type under the mountpoint, wants to
         * traverse through the mountpoint or wants to open the
         * mounted directory.  Also, autofs may mark negative dentries
         * as being automount points.  These will need the attentions
         * of the daemon to instantiate them before they can be used.
         */
        if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
                     LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
            path->dentry->d_inode)
                return -EISDIR;

But I'm seeing this with 3.4.9-2.fc16 which does have this in
follow_automount() so that's a puzzle.

> 
> ## starting up
> 
> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version
> 5.0.7, master map auto.master
> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02
> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading
> master files auto.master
> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init
> gathered global options: (null)
> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected,
> passing -n to mount
> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected,
> passing -n to mount
> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file):
> read entry /nfs
> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs
> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name
> /run/autofs.fifo-nfs
> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map
> file /etc/auto.nfs
> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init
> gathered global options: (null)
> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected,
> passing -n to mount
> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected,
> passing -n to mount
> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with
> timeout 60, freq 15 seconds
> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs
> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled
> 
> ## ls /nfs/v2
> 
> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3
> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect:
> token 6, name v2, request pid 2352
> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2
> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2
> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 ->
> server:/nfs
> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded
> entry: server:/nfs
> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options:
> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun):
> dequote("server:/nfs") -> server:/nfs
> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of
> entry: options=, loc=server:/nfs
> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting
> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null)
> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs
> name=v2 what=server:/nfs, fstype=nfs, options=(null)
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host
> server(10.1.1.4) proto tcp version 0x30
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host
> server(10.1.1.4) proto udp version 0x30
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095
> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0
> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset
> of hosts that support NFS3 over TCP
> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling
> mkdir_path /nfs/v2
> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling
> mount -t nfs server:/nfs /nfs/v2
> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected,
> passing -n to mount
> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file
> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd
> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file
> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd
> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting
> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC
> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done
> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by
> server while mounting server:/nfs
> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure
> server:/nfs on /nfs/v2
> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6
> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2
> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs
> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc =
> 140484187907840 path /nfs
> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid
> 140484187907840 path /nfs stat 0
> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp
> 140484187907840 finished, switching from 2 to 1
> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs
> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs
> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc =
> 140484187907840 path /nfs
> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid
> 140484187907840 path /nfs stat 0
> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp
> 140484187907840 finished, switching from 2 to 1
> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs
> 
> ## ls -la /nfs
> 
> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3
> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect:
> token 7, name v2, request pid 2397
> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7

This callback shouldn't happen.

> 
> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering
> the mount:

This certainly looks like an autofs dentry.

> 
> struct dentry {
>   d_flags = 393216,
>   d_seq = {
>     sequence = 4
>   },
>   d_hash = {
>     next = 0x0,
>     pprev = 0xffffc900011f5680
>   },

This looks a bit odd but I don't think it is the cause of the problem.

Other than that these look ok.

snip ...

I setup a simple map, auto.test, with a single key "this" to check this
out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test"
against the 3.4.9-2.fc16 I currently have installed.

Ignoring all the cruft, near the bottom we see:
stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
getdents64(3, /* 3 entries */, 32768)   = 72
lstat64("/test/this", 0x95aba40)        = -1 ENOENT (No such file or directory)

This is the problem I think, it should not attempt a mount and should
return 0.

I've only looked at the current upstream source so far and I can't see
why this is happening.

I've built upstream 3.5.4 so I can look at the actual source of the
kernel I'm using to test.

Ian


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux