Re: Unable to re-mount after timed out successful umount

On Wed, 2012-11-21 at 20:07 +0000, Knister, Aaron (NIH/NLM/NCBI) [C]
wrote:
> Hi,
> 
> I'm currently experiencing an issue on a large (several hundred) number
> of systems by which autofs mounts frequently become stale and don't
> trigger a mount. The automounts in question are NFS mounts. Based on
> the logs captured on a system that had debugging enabled umount
> appeared to timeout but did at some point succeed as evidenced by the
> fact that the mount in question isn't mounted after this issue occurs.
> (Why umount is taking so darned long is the subject of another
> investigation underway).
> 
> I created what I believe to be a similar test situation in which the
> umount command doesn't complete within the autofs umount timeout value
> but does eventually succeed. Here are logs from autofs with debugging
> enabled on the test system, and the mount in question is
> /net/nabl000/vol/blast:
> 
> Nov 21 10:13:58 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet_expire_direct: token 4829, name /net/nabl000/vol/blast
> Nov 21 10:13:58 asktest1 automount[27418]: st_expire: state 1 path /net
> Nov 21 10:13:58 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4829
> Nov 21 10:14:10 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet_expire_direct: token 4830, name /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4830
> Nov 21 10:14:22 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet_expire_direct: token 4831, name /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:14:22 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4831
> 
> Once this occurs the autofs mount at /net/nabl000/vol/blast/ will not
> trigger a remount. Examining /net/nabl000/vol/blast using systemtap
> reveals that the dentry at /net/nabl000/vol/blast/ does not have the
> DMANAGED_AUTOMOUNT flag set on its dentry->d_mounted object, the
> presence of which I believe is what causes follow_automount() in
> fs/namei.c to trigger an automount. If I set this flag on the d_mounted
> object of the dentry (again, using stap) then follow_automount is
> called, however in autofs4_d_automount the mount is not processed
> because the d_subdirs object of the dentry in question doesn't appear
> empty. What's interesting, however, is that if I kill -9 automount and
> repeat the process of setting the DMANAGED_AUTOMOUNT flag then
> list_empty(d_subdirs) evaluates to true. I can't quite figure this out
> (any insight would be appreciated). I suspect it has something to do
> with automount having /net/nabl000/vol/blast open().
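In the quoted log each "unmounting dir" line is followed 12 seconds later by "could not umount dir" for the same path. A minimal sketch for pulling those intervals out of a debug log, assuming syslog-style lines like the ones quoted; the function name, the regular expression and the default year are illustrative choices, not part of autofs:

```python
import re
from datetime import datetime

# Sample lines copied from the quoted log.
LOG = """\
Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
"""

# Timestamp, host, "automount[pid]:", then the message text.
LINE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ automount\[\d+\]: (.*)$")

START = "unmounting dir = "
FAIL = "could not umount dir "


def umount_attempts(text, year=2012):
    """Return (path, seconds) for each umount attempt later reported as failed.

    Syslog timestamps carry no year, so one is assumed (hypothetical default).
    """
    started = {}
    results = []
    for raw in text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(2)
        if msg.startswith(START):
            started[msg[len(START):]] = stamp
        elif msg.startswith(FAIL):
            path = msg[len(FAIL):]
            if path in started:
                elapsed = stamp - started.pop(path)
                results.append((path, int(elapsed.total_seconds())))
    return results


if __name__ == "__main__":
    for path, seconds in umount_attempts(LOG):
        print(f"{path}: gave up after {seconds}s")
```

Run against a full debug log, this makes it easy to see whether every failed expire gives up after the same interval.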

Presumably this is an autofs hosts map?

I suspect I'm working on this at the moment.

I don't know what kernel you are running either and, in any case, I've
not finished the patches.

I believe the problem occurs because when the control file handle is
opened on the autofs fs dentry it implicitly creates a hidden dentry
within the mount point and discards it when the descriptor is closed.

This is a side effect of the mount wait I hadn't considered, and I think
there may be more to fixing this than the patch below. But this behavior
might just look to autofs like a manual umount of the offset, which is
fixed by the patch.

You could use this patch as an interim workaround, but it will not be
what is committed upstream, as you can see from the following discussion
in the thread.

https://lkml.org/lkml/2012/11/15/682

But that assumes you can apply it to your kernel version, which you
probably can't since DMANAGED_AUTOMOUNT is not used in current upstream
kernels. It was used in the original development. Neither is ->d_mounted
used for the flags, so it looks like you have a pre-scalability series
applied to your kernel.

Tell me about the kernel you're using.
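One way to collect that information from many systems is sketched below, assuming a Linux host where /proc/mounts is readable. The helper names and the sample mount lines are made up for illustration; only platform.release() and the /proc/mounts field layout (device, mount point, fs type, options) are relied on.

```python
import platform

# Illustrative /proc/mounts-style text; these entries are made up.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/etc/auto.example /net autofs rw,relatime,timeout=300 0 0
"""


def autofs_mounts(mounts_text):
    """Return (mount_point, options) for every autofs entry in the text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "autofs":
            entries.append((fields[1], fields[3]))
    return entries


def kernel_report(mounts_path="/proc/mounts"):
    """Build a short report: kernel release plus the autofs mount entries."""
    with open(mounts_path) as f:
        entries = autofs_mounts(f.read())
    lines = [f"kernel release: {platform.release()}"]
    lines += [f"autofs mount: {point} ({opts})" for point, opts in entries]
    return "\n".join(lines)


if __name__ == "__main__":
    print(autofs_mounts(SAMPLE_MOUNTS))
```

Calling kernel_report() on an affected machine gives the running kernel release and the autofs mount options in one place; distribution package versions would still need to be gathered separately.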

Ian

