On Wed, 2012-11-21 at 20:07 +0000, Knister, Aaron (NIH/NLM/NCBI) [C] wrote:
> Hi,
>
> I'm currently experiencing an issue on a large (several hundred) number
> of systems by which autofs mounts frequently become stale and don't
> trigger a mount. The automounts in question are NFS mounts. Based on
> the logs captured on a system that had debugging enabled umount
> appeared to timeout but did at some point succeed as evidenced by the
> fact that the mount in question isn't mounted after this issue occurs.
> (Why umount is taking so darned long is the subject of another
> investigation underway).
>
> I created what I believe to be a similar test situation in which the
> umount command doesn't complete within the autofs umount timeout value
> but does eventually succeed. Here are logs from autofs with debugging
> enabled on the test system, and the mount in question is
> /net/nabl000/vol/blast:
>
> Nov 21 10:13:58 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:13:58 asktest1 automount[27418]: handle_packet_expire_direct: token 4829, name /net/nabl000/vol/blast
> Nov 21 10:13:58 asktest1 automount[27418]: st_expire: state 1 path /net
> Nov 21 10:13:58 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4829
> Nov 21 10:14:10 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:14:10 asktest1 automount[27418]: handle_packet_expire_direct: token 4830, name /net/nabl000/vol/blast
> Nov 21 10:14:10 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4830
> Nov 21 10:14:22 asktest1 automount[27418]: expiring path /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet: type = 6
> Nov 21 10:14:22 asktest1 automount[27418]: handle_packet_expire_direct: token 4831, name /net/nabl000/vol/blast
> Nov 21 10:14:22 asktest1 automount[27418]: umount_multi: path /net/nabl000/vol/blast incl 1
> Nov 21 10:14:22 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: couldn't complete expire of /net/nabl000/vol/blast
> Nov 21 10:14:34 asktest1 automount[27418]: dev_ioctl_send_fail: token = 4831
>
> Once this occurs the autofs mount at /net/nabl000/vol/blast/ will not
> trigger a remount. Examining /net/nabl000/vol/blast using systemtap
> reveals that the dentry at /net/nabl000/vol/blast/ does not have the
> DMANAGED_AUTOMOUNT flag set on its dentry->d_mounted object the
> presence of which I believe is what causes follow_automount() in
> fs/namei.c to trigger an automount. If I set this flag on the d_mounted
> object of the dentry (again, using stap) then follow_automount is
> called, however in autofs4_d_automount the mount is not processed
> because the d_subdirs object of the dentry in question doesn't appear
> empty. What's interesting, however, is that if I kill -9 automount and
> repeat the process of setting the DMANAGED_AUTOMOUNT flag then
> list_empty(d_subdirs) evaluates to true. I can't quite figure this out
> (any insight would be appreciated). I suspect it has something to do
> with automount having /net/nabl000/vol/blast open().

Presumably this is an autofs hosts map?

I suspect this is the problem I'm working on at the moment. I don't
know what kernel you are running, though, and in any case I've not
finished the patches.

I believe the problem occurs because, when the control file handle is
opened on the autofs fs dentry, it implicitly creates a hidden dentry
within the mount point and discards it when the descriptor is closed.

This is a side effect of the mount wait that I hadn't considered. I
think there may be more to fixing this than the patch below, but this
behavior might just look to autofs like a manual umount of the offset,
which is what the patch fixes.

You could use this patch as an interim workaround, but it will not be
what is committed upstream, as you can see from the discussion in this
thread:

https://lkml.org/lkml/2012/11/15/682

But that assumes you can apply it to your kernel version, which you
probably can't, since DMANAGED_AUTOMOUNT is not used in current
upstream kernels. It was used in the original development. Neither is
->d_mounted used for the flags, so it looks like you have a
pre-scalability series applied to your kernel.

Tell me about the kernel you're using.

Ian
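
For anyone comparing their own debug logs against the ones quoted in this thread, here is a minimal sketch (not part of autofs) that pairs each "unmounting dir" message with the next "could not umount dir" message for the same path and reports the time between them. The function name failed_umounts, the regular expressions, and the assumed year are illustrative assumptions; the message formats are taken only from the quoted log lines and may differ in other autofs versions.

```python
import re
from datetime import datetime

# Syslog line layout as it appears in the quoted logs (an assumption for
# other systems): "Nov 21 10:13:58 host automount[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+automount\[\d+\]:\s+(?P<msg>.*)$"
)
START_RE = re.compile(r"^unmounting dir = (?P<path>\S+)$")
FAIL_RE = re.compile(r"^could not umount dir (?P<path>\S+)$")


def failed_umounts(lines, year=2012):
    """Return a list of (path, seconds) for each umount attempt reported as failed.

    Syslog timestamps carry no year, so one is supplied by the caller.
    """
    started = {}   # path -> time the current umount attempt began
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        msg = m.group("msg")

        start = START_RE.match(msg)
        if start:
            started[start.group("path")] = ts
            continue

        fail = FAIL_RE.match(msg)
        if fail and fail.group("path") in started:
            path = fail.group("path")
            elapsed = (ts - started.pop(path)).total_seconds()
            results.append((path, elapsed))
    return results


# Four lines copied from the quoted logs.
SAMPLE = """\
Nov 21 10:13:58 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
Nov 21 10:14:10 asktest1 automount[27418]: unmounting dir = /net/nabl000/vol/blast
Nov 21 10:14:22 asktest1 automount[27418]: could not umount dir /net/nabl000/vol/blast
""".splitlines()

if __name__ == "__main__":
    for path, seconds in failed_umounts(SAMPLE):
        print(f"{path}: umount attempt abandoned after {seconds:.0f}s")
```

In the quoted logs each attempt is abandoned 12 seconds after it starts, which is consistent with the report that umount does not complete within the autofs umount timeout. The script only measures that interval; it says nothing about the dentry state discussed above.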