Re: Unable to re-mount after timed out successful umount

On 11/23/12 11:12 AM, Aaron Knister wrote:
On 11/21/12 10:20 PM, Ian Kent wrote:
On Wed, 2012-11-21 at 20:07 +0000, Knister, Aaron (NIH/NLM/NCBI) [C]
wrote:
Hi,

I'm currently experiencing an issue on a large number (several hundred)
of systems in which autofs mounts frequently become stale and don't
trigger a mount. The automounts in question are NFS mounts. Based on
the logs captured on a system that had debugging enabled, umount
appeared to time out but did at some point succeed, as evidenced by the
fact that the mount in question isn't mounted after this issue occurs.
(Why umount is taking so darned long is the subject of another
investigation underway).
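
The "isn't mounted" check described above can be scripted so it can be run across many hosts. The following is a minimal sketch, not part of the original report: the helper name is_mounted and the sample mount line are illustrative assumptions, and the parsing relies only on the standard /proc/mounts layout (device, mount point, type, options, ...).

```python
def is_mounted(path, mounts_text):
    """Return True if `path` is listed as a mount point in /proc/mounts-style text.

    `is_mounted` is a hypothetical helper name, not an autofs tool.
    """
    target = path.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Field 2 is the mount point; /proc/mounts escapes spaces as \040.
        mount_point = fields[1].replace("\\040", " ")
        if (mount_point.rstrip("/") or "/") == target:
            return True
    return False


if __name__ == "__main__":
    # Illustrative sample line only; on a real system read /proc/mounts instead.
    sample = "nabl000:/vol/blast /net/nabl000/vol/blast nfs rw 0 0\n"
    print(is_mounted("/net/nabl000/vol/blast", sample))
```

On a live system the text would come from open("/proc/mounts").read(). A False result for a path that autofs still refuses to re-trigger would match the stale state described here.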

I created what I believe to be a similar test situation in which the
umount command doesn't complete within the autofs umount timeout value
but does eventually succeed. Here are logs from autofs with debugging
enabled on the test system, and the mount in question is
/net/nabl000/vol/blast:

Nov 21 10:13:58 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
Nov 21 10:13:58 asktest1 automount[27418]: handle_packet: type = 6
Nov 21 10:13:58 asktest1 automount[27418]: handle_packet_expire_direct: token 4829, name /net/nabl000/vol/blast
Nov 21 10:13:58 asktest1 automount[27418]: st_expire: state 1 path /net
Nov 21 10:13:58 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4829
Nov 21 10:14:10 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: handle_packet: type = 6
Nov 21 10:14:10 asktest1 automount[27418]: handle_packet_expire_direct: token 4830, name /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4830
Nov 21 10:14:22 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: handle_packet: type = 6
Nov 21 10:14:22 asktest1 automount[27418]: handle_packet_expire_direct: token 4831, name /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
Nov 21 10:14:22 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:34 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
Nov 21 10:14:34 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
Nov 21 10:14:34 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4831
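
To spot this pattern on other hosts, the debug log can be scanned for umount attempts that automount gave up on, along with how long each attempt ran. This is only a sketch: the function name failed_umount_durations is hypothetical, the regular expression assumes the syslog layout shown in the excerpt above, and the year is assumed to be 2012 because syslog timestamps omit it.

```python
import re
from datetime import datetime

# Matches the two debug messages that bracket a failed umount attempt.
EVENT = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ automount\[\d+\]: "
    r"(unmounting dir = |could not umount dir )(\S+)"
)


def failed_umount_durations(log_text, year=2012):
    """Return (path, seconds) for each umount attempt automount gave up on.

    `year` is an assumption: syslog timestamps do not carry one.
    """
    started = {}
    results = []
    for stamp, kind, path in EVENT.findall(log_text):
        normalized = " ".join(stamp.split())  # collapse padding before the day
        when = datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")
        if kind.startswith("unmounting"):
            started[path] = when
        elif path in started:
            elapsed = when - started.pop(path)
            results.append((path, int(elapsed.total_seconds())))
    return results


if __name__ == "__main__":
    # Two lines copied from the excerpt above.
    excerpt = (
        "Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast\n"
        "Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast\n"
    )
    print(failed_umount_durations(excerpt))
```

Because the pattern is matched with findall over the whole text, it does not depend on each log entry sitting on its own line.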

Once this occurs the autofs mount at /net/nabl000/vol/blast/ will not
trigger a remount. Examining /net/nabl000/vol/blast using systemtap
reveals that the dentry at /net/nabl000/vol/blast/ does not have the
DMANAGED_AUTOMOUNT flag set on its dentry->d_mounted object, the
presence of which I believe is what causes follow_automount() in
fs/namei.c to trigger an automount. If I set this flag on the d_mounted
object of the dentry (again, using stap) then follow_automount is
called, however in autofs4_d_automount the mount is not processed
because the d_subdirs object of the dentry in question doesn't appear
empty. What's interesting, however, is that if I kill -9 automount and
repeat the process of setting the DMANAGED_AUTOMOUNT flag then
list_empty(d_subdirs) evaluates to true. I can't quite figure this out
(any insight would be appreciated). I suspect it has something to do
with automount having /net/nabl000/vol/blast open().

Presumably this is an autofs hosts map?

I suspect I'm working on this ATM.

I don't know what kernel you are running either and, in any case, I've
not finished the patches.

I believe the problem occurs because when the control file handle is
opened on the autofs fs dentry it implicitly creates a hidden dentry
within the mount point and discards it when the descriptor is closed.

This is a side effect of the mount wait I hadn't considered and I think
there may be more to fixing this than the patch below but this behavior
might just look to autofs like a manual umount of the offset which is
fixed by the patch.

You could use this patch as an interim workaround but it will not be
what is committed upstream, as you can see by the following discussion
in the thread.

https://lkml.org/lkml/2012/11/15/682

But that assumes you can apply it to your kernel version, which it
probably won't since DMANAGED_AUTOMOUNT is not used in current upstream
kernels. It was used in the original development. Neither is ->d_mounted
used for the flags, so it looks like you have a pre-scalability series
applied to your kernel.

Tell me about the kernel you're using.

Ian


Hi Ian,

Thank you for your quick reply!

Out of curiosity, why does the control file handle implicitly create a hidden dentry? I've been trying to determine how/why this is so. I've been digging through the autofs4/kernel vfs code trying to find the answer.

I'm using the RHEL 2.6.18-308.8.1.el5 x86_64 kernel. I could perhaps work the patch to apply cleanly to this kernel but, as you alluded to, I don't believe this will fix the problem in its entirety for me, since the DMANAGED_AUTOMOUNT flag will still not be set, which in the kernel version I'm using appears to be what results in follow_automount() being called. A cursory glance at the latest autofs4 code suggests to me that the same situation may arise, albeit with different code. I wonder if there's a way to systematically trigger the DMANAGED_AUTOMOUNT flag (or its equivalent) being set on the autofs mount's dentry upon unmount of the filesystem sitting atop it. Currently I believe that flag is set by the autofs4 code on a successful mount expiration, but if autofs is unaware of the umount I imagine it would be improbable for it to set said flag.

On second thoughts, perhaps autofs4_d_manage could examine d_mounted and determine if the dentry is mounted. If not mounted, it would somehow determine if the DMANAGED_AUTOMOUNT flag should be set and then set it. I imagine this is an oversimplification, though. I also realize that these suggestions are based on the older code base.

Best,
Aaron
--
To unsubscribe from this list: send the line "unsubscribe autofs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
