Re: Unable to re-mount after timed out successful umount

On Fri, 2012-11-23 at 15:13 -0500, Aaron Knister wrote:
> On 11/23/12 11:46 AM, Aaron Knister wrote:
> > On 11/23/12 11:12 AM, Aaron Knister wrote:
> >> On 11/21/12 10:20 PM, Ian Kent wrote:
> >>> On Wed, 2012-11-21 at 20:07 +0000, Knister, Aaron (NIH/NLM/NCBI) [C]
> >>> wrote:
> >>>> Hi,
> >>>>
> >>>> I'm currently experiencing an issue on a large number (several hundred)
> >>>> of systems by which autofs mounts frequently become stale and don't
> >>>> trigger a mount. The automounts in question are NFS mounts. Based on
> >>>> the logs captured on a system that had debugging enabled, umount
> >>>> appeared to time out but did at some point succeed, as evidenced by the
> >>>> fact that the mount in question isn't mounted after this issue occurs.
> >>>> (Why umount is taking so darned long is the subject of another
> >>>> investigation underway).
> >>>>
> >>>> I created what I believe to be a similar test situation in which the
> >>>> umount command doesn't complete within the autofs umount timeout value
> >>>> but does eventually succeed. Here are logs from autofs with debugging
> >>>> enabled on the test system, and the mount in question is
> >>>> /net/nabl000/vol/blast:
> >>>>
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet: type = 6
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet_expire_direct: token 4829, name /net/nabl000/vol/blast
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: st_expire: state 1 path /net
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> >>>> Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4829
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet: type = 6
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet_expire_direct: token 4830, name /net/nabl000/vol/blast
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> >>>> Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4830
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet: type = 6
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet_expire_direct: token 4831, name /net/nabl000/vol/blast
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> >>>> Nov 21 10:14:22 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> >>>> Nov 21 10:14:34 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> >>>> Nov 21 10:14:34 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> >>>> Nov 21 10:14:34 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4831
> >>>>
> >>>> Once this occurs the autofs mount at /net/nabl000/vol/blast/ will not
> >>>> trigger a remount. Examining /net/nabl000/vol/blast using systemtap
> >>>> reveals that the dentry at /net/nabl000/vol/blast/ does not have the
> >>>> DMANAGED_AUTOMOUNT flag set on its dentry->d_mounted object, the
> >>>> presence of which I believe is what causes follow_automount() in
> >>>> fs/namei.c to trigger an automount. If I set this flag on the d_mounted
> >>>> object of the dentry (again, using stap) then follow_automount is
> >>>> called; however, in autofs4_d_automount the mount is not processed
> >>>> because the d_subdirs object of the dentry in question doesn't appear
> >>>> empty. What's interesting, however, is that if I kill -9 automount and
> >>>> repeat the process of setting the DMANAGED_AUTOMOUNT flag then
> >>>> list_empty(d_subdirs) evaluates to true. I can't quite figure this out
> >>>> (any insight would be appreciated). I suspect it has something to do
> >>>> with automount having /net/nabl000/vol/blast open().
> >>> Presumably this is an autofs hosts map?
> >>>
> >>> I suspect I'm working on this at the moment.
> >>>
> >>> I don't know what kernel you are running either and, in any case, I've
> >>> not finished the patches.
> >>>
> >>> I believe the problem occurs because when the control file handle is
> >>> opened on the autofs fs dentry it implicitly creates a hidden dentry
> >>> within the mount point and discards it when the descriptor is closed.
> >>>
> >>> This is a side effect of the mount wait I hadn't considered and I think
> >>> there may be more to fixing this than the patch below but this behavior
> >>> might just look to autofs like a manual umount of the offset which is
> >>> fixed by the patch.
> >>>
> >>> You could use this patch as an interim workaround but it will not be
> >>> what is committed upstream, as you can see by the following discussion
> >>> in the thread.
> >>>
> >>> https://lkml.org/lkml/2012/11/15/682
> >>>
> >>> But that assumes it applies to your kernel version, which it probably
> >>> won't since DMANAGED_AUTOMOUNT is not used in current upstream
> >>> kernels. It was used in the original development. Neither is ->d_mounted
> >>> used for the flags, so it looks like you have a pre-scalability series
> >>> applied to your kernel.
> >>>
> >>> Tell me about the kernel you're using.
> >>>
> >>> Ian
> >>>
> >>>
> >> Hi Ian,
> >>
> >> Thank you for your quick reply!
> >>
> >> Out of curiosity, why does the control file handle implicitly create 
> >> a hidden dentry? I've been trying to determine how/why this is so. 
> >> I've been digging through the autofs4/kernel vfs code trying to find 
> >> the answer.

It's to keep track of directory traversals.

If the directory is large you might not get all entries on the first
directory read so the position needs to be remembered. There are a
number of simple fs functions in fs/libfs.c that can be used by file
systems, in particular dcache_dir_open() and dcache_dir_close() which
autofs uses. They create a cursor dentry which is negative and unhashed
so, as far as the VFS is concerned, it's not there, but it is still
linked into the parent's d_subdirs list, so using list_empty() on that
list doesn't always work.
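
To illustrate the point, here is a minimal user-space sketch of why a
bare list_empty() check on the child list can be fooled by a hidden
dentry. Everything in it (struct toy_dentry, the toy_list helpers,
toy_has_visible_children()) is a hypothetical stand-in for illustration,
not kernel code; it only assumes that the hidden dentry is linked into
the parent's child list while being negative and unhashed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy circular doubly linked list, modelled loosely on the kernel's. */
struct toy_list { struct toy_list *next, *prev; };

static void toy_list_init(struct toy_list *h) { h->next = h; h->prev = h; }
static bool toy_list_empty(const struct toy_list *h) { return h->next == h; }

static void toy_list_add(struct toy_list *n, struct toy_list *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void toy_list_del(struct toy_list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    toy_list_init(n);
}

/* Hypothetical, heavily simplified dentry. */
struct toy_dentry {
    bool has_inode;          /* false means "negative" */
    bool hashed;             /* false means invisible to lookups */
    struct toy_list child;   /* link in the parent's subdirs list */
    struct toy_list subdirs; /* this dentry's children */
};

static void toy_dentry_init(struct toy_dentry *d, bool has_inode, bool hashed)
{
    d->has_inode = has_inode;
    d->hashed = hashed;
    toy_list_init(&d->child);
    toy_list_init(&d->subdirs);
}

/* Walk the children and count only positive, hashed entries as "there". */
static bool toy_has_visible_children(struct toy_dentry *parent)
{
    struct toy_list *p;

    for (p = parent->subdirs.next; p != &parent->subdirs; p = p->next) {
        struct toy_dentry *c = (struct toy_dentry *)
            ((char *)p - offsetof(struct toy_dentry, child));
        if (c->has_inode && c->hashed)
            return true;
    }
    return false;
}

/* Open the directory (hidden dentry appears), then close it again. */
static void toy_cursor_scenario(void)
{
    struct toy_dentry mountpoint, cursor;

    toy_dentry_init(&mountpoint, true, true);
    toy_dentry_init(&cursor, false, false);   /* negative and unhashed */
    assert(toy_list_empty(&mountpoint.subdirs));

    toy_list_add(&cursor.child, &mountpoint.subdirs);  /* "open" */
    assert(!toy_list_empty(&mountpoint.subdirs));      /* looks non-empty */
    assert(!toy_has_visible_children(&mountpoint));    /* nothing visible */

    toy_list_del(&cursor.child);                       /* "close" */
    assert(toy_list_empty(&mountpoint.subdirs));
}
```

In this model the list only looks empty again once the descriptor is
closed, which would be consistent with list_empty(d_subdirs) evaluating
to true after automount is killed. A check that skips negative or
unhashed children, as toy_has_visible_children() does here, is not
affected by the hidden entry.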

> >>
> >> I'm using the RHEL 2.6.18-308.8.1.el5 x86_64 kernel. I could perhaps 

Right, I'm the one who did the back port (actually horizontal port of
the original work) so I'm quite familiar with that kernel.

> >> work the patch to apply cleanly to this kernel but as you alluded to I 
> >> don't believe this will fix the problem in its entirety for me since 
> >> the DMANAGED_AUTOMOUNT flag will still not be set which in the kernel 
> >> version I'm using appears to be what results in follow_automount() 
> >> being called. Doing a cursory glance at the latest autofs4 code 
> >> suggests to me that the same situation may arise albeit with 
> >> different code. I wonder if there's a way to systematically trigger 
> >> the DMANAGED_AUTOMOUNT flag (or its equivalent) being set on the 
> >> autofs mount's dentry upon unmount of the filesystem sitting atop it. 
> >> Currently I believe that flag is set by the autofs4 code on a 
> >> successful mount expiration, but if autofs is unaware of the umount I 
> >> imagine it would be improbable for it to set said flag.
> >>
> > On second thoughts perhaps autofs4_d_manage could examine d_mounted 
> > and determine if the dentry is mounted. If not mounted it would 
> > somehow determine if the DMANAGED_AUTOMOUNT flag should be set and 
> > then set it. I imagine this is an oversimplification, though. I also 
> > realize that these suggestions are based on the older code base.
> >> Best,
> >> Aaron
> >> -- 
> >> To unsubscribe from this list: send the line "unsubscribe autofs" in
> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> Hi Ian,
> 
> Please disregard my previous comments about automatically re-setting the 
> DMANAGED_AUTOMOUNT flag. It was my impression that this flag was unset 
> after each offset mount and re-set after each unmount. I now realize 
> that this isn't necessarily the case.

But there is something else about setting and clearing the flag.

To set and clear the flag opens autofs up to having a dentry without the
flag set when it should be. Although the specific dentries this is done
for don't actually have a real mount, they are the base of a multi-mount
that has no actual root mount. More importantly it involves a seriously
ugly logical check in the expire code so playing with that flag will go
away. I already have patches for it and they appear to be working but
I'm not totally sure yet.

> 
> I took the patch you proposed and back-ported it to the RHEL5 kernel and 
> it appears to resolve the issue! (With the patch applied I have, 
> however, noticed left-behind entries in /etc/mtab but I suspect that's a 
> user-space problem).

Right, that's been a problem for a long time and attempts to resolve it
don't seem to have worked or have stopped working down the track. I've
done a lot to make autofs independent of that and I don't think it's a
problem for autofs although it is annoying for users.

> 
> The patch is pasted below-- could you give me some feedback on its 
> suitability? If you think it looks good I'll open a bug in Red Hat's 
> bugzilla.

There's already a bugzilla but, unfortunately, it's private.
I've already done a patch, pretty much what you have below, and I'm
waiting for development approval (basically bugzilla bug acks).

Ian

