Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 26, 2012 at 6:15 AM, Ian Kent <raven@xxxxxxxxxx> wrote:
> On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote:
>> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@xxxxxxxxxx> wrote:
>> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote:
>> >> Hi,
>> >>
>> >> If you try to access an automounted directory and the mount operation fails,
>> >> the attributes of the directory (owner, group, mode, access time) are lost.
>> >> This happens only when AutoFS' browse mode (ghosting) is enabled.
>> >>
>> >> How to reproduce the problem:
>> >>
>> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs.
>> >>
>> >> 1. Add a simple indirect mount map (it's required that the server does NOT
>> >>    export the /notexported directory):
>> >>
>> >> # cat /etc/auto.master
>> >> /nfs    /etc/auto.nfs
>> >>
>> >> # cat /etc/auto.nfs
>> >> v2      server:/notexported
>> >>
>> >> 2. Restart AutoFS and try to access the directory:
>> >>
>> >> # cd /nfs/v2
>> >> -bash: cd: /nfs/v2: No such file or directory
>> >>
>> >> # ls -la /nfs
>> >> ls: cannot access /nfs/v2: No such file or directory
>> >> total 4
>> >> drwxr-xr-x  3 root root    0 Sep 24 21:43 .
>> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 ..
>> >> d?????????  ? ?    ?       ?            ? v2
>> >>
>> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6.
>> >> I found out that the last kernel version that does not exhibit the bug is
>> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in
>> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by
>> >> early crashes.
>> >>
>> >> Any ideas?
>> >
>> > Not really.
>> >
>> > It would be best to start with an full autofs debug log so we can see
>> > what is happening within automount and work back from there. We'll
>> > probably need to look at the dentry itself as well to see if we can get
>> > any info about where it came from possibly, with a kernel dump.
>> >
>> > Ian
>>
>> Hi Ian,
>>
>> Here are the debug logs from automount and a snapshot of the dentry
>> and inode of the "broken" directory (running 2.6.38).
>>
>> I tried to compare it with 2.6.37 (which works), but the structures
>> changed a lot between these kernels.
>
> A great deal changed in 2.6.38 in the VFS.
>
> You really can't use 2.6.38 because there were a number of fixes that
> went in after this and I think the VFS code up to around 3.0 or 3.1
> could have had this problem anyway.

I'm using 3.6.0-rc6 here. I collected the data from 2.6.38 because this
is the first kernel that shows the bug and I thought it would be easier
to isolate the fix this way (smaller distance).

> What your looking for is kernels with fs/namei.c:follow_automount() with
> this near the top of the function, which shouldn't cause this type of
> problem:
>
>        /* We don't want to mount if someone's just doing a stat -
>          * unless they're stat'ing a directory and appended a '/' to
>          * the name.
>          *
>          * We do, however, want to mount if someone wants to open or
>          * create a file of any type under the mountpoint, wants to
>          * traverse through the mountpoint or wants to open the
>          * mounted directory.  Also, autofs may mark negative dentries
>          * as being automount points.  These will need the attentions
>          * of the daemon to instantiate them before they can be used.
>          */
>         if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
>                      LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
>             path->dentry->d_inode)
>                 return -EISDIR;
>
> But I'm seeing this with 3.4.9-2.fc16 which does have this in
> follow_automount() so that's a puzzle.
>
>>
>> ## starting up
>>
>> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version
>> 5.0.7, master map auto.master
>> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02
>> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading
>> master files auto.master
>> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init
>> gathered global options: (null)
>> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected,
>> passing -n to mount
>> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected,
>> passing -n to mount
>> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file):
>> read entry /nfs
>> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs
>> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name
>> /run/autofs.fifo-nfs
>> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map
>> file /etc/auto.nfs
>> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init
>> gathered global options: (null)
>> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected,
>> passing -n to mount
>> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected,
>> passing -n to mount
>> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with
>> timeout 60, freq 15 seconds
>> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs
>> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled
>>
>> ## ls /nfs/v2
>>
>> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3
>> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect:
>> token 6, name v2, request pid 2352
>> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2
>> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2
>> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 ->
>> server:/nfs
>> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded
>> entry: server:/nfs
>> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options:
>> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun):
>> dequote("server:/nfs") -> server:/nfs
>> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of
>> entry: options=, loc=server:/nfs
>> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting
>> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null)
>> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs
>> name=v2 what=server:/nfs, fstype=nfs, options=(null)
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host
>> server(10.1.1.4) proto tcp version 0x30
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host
>> server(10.1.1.4) proto udp version 0x30
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095
>> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0
>> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset
>> of hosts that support NFS3 over TCP
>> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling
>> mkdir_path /nfs/v2
>> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling
>> mount -t nfs server:/nfs /nfs/v2
>> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected,
>> passing -n to mount
>> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file
>> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd
>> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file
>> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd
>> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting
>> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC
>> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done
>> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by
>> server while mounting server:/nfs
>> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure
>> server:/nfs on /nfs/v2
>> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6
>> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2
>> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs
>> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc =
>> 140484187907840 path /nfs
>> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid
>> 140484187907840 path /nfs stat 0
>> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp
>> 140484187907840 finished, switching from 2 to 1
>> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs
>> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs
>> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc =
>> 140484187907840 path /nfs
>> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid
>> 140484187907840 path /nfs stat 0
>> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp
>> 140484187907840 finished, switching from 2 to 1
>> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs
>>
>> ## ls -la /nfs
>>
>> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3
>> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect:
>> token 7, name v2, request pid 2397
>> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7
>
> This callback shouldn't happen.
>
>>
>> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering
>> the mount:
>
> This certainly looks like an autofs dentry.
>
>>
>> struct dentry {
>>   d_flags = 393216,
>>   d_seq = {
>>     sequence = 4
>>   },
>>   d_hash = {
>>     next = 0x0,
>>     pprev = 0xffffc900011f5680
>>   },
>
> This looks a bit odd but I don't think it is the cause of the problem.
>
> Other than that these look ok.
>
> snip ...
>
> I setup a simple map, auto.test, with a single key "this" to check this
> out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test"
> against the 3.4.9-2.fc16 I currently have installed.
>
> Ignoring all the cruft, near the bottom we see:
> stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
> getdents64(3, /* 3 entries */, 32768)   = 72
> lstat64("/test/this", 0x95aba40)        = -1 ENOENT (No such file or directory)

This happens only after there was an attempt to mount /test/this
and the directory is already in the "incomplete" state, right?

> This is the problem I think, it should not attempt a mount and should
> return 0.

My theory about the root cause is that when the mount fails, the
kernel forgets to "reset" the state of that dentry. Makes any sense?

> I've only looked at the current upstream source so far and I can't see
> why this is happening.
>
> I've built upstream 3.5.4 so I can look at the actual source of the
> kernel I'm using to test.

Thanks for your help
Leonardo
--
To unsubscribe from this list: send the line "unsubscribe autofs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux Ext4]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux