On Fri, 2013-10-11 at 07:29 -0600, David Ahern wrote:
> On 10/11/13 3:55 AM, Ian Kent wrote:
> > On Fri, 2013-10-11 at 10:06 +0800, Ian Kent wrote:
> >> On Thu, 2013-10-10 at 17:22 -0600, David Ahern wrote:
> >>> Running 3.12-rc3 just hit BUG in autofs4_expire_wait
> >>
> >> It doesn't look like this could be due to Al's change to the locking in
> >> autofs4_wait() and that's the only change to autofs that I'm aware of.
> >>
> >> Could you do a bisect please?
> >
> > Of course that assumes it's repeatable.
> > Is it?
> >
> > Can you provide any information about the environment and activity that
> > was happening at the time of the BUG()?
>
> The system was up and running for 9 days before hitting the BUG. After
> that with 3 cpus on softlockup I had to do a reboot (forced). After the
> reboot I continued the workload again without a repeat incident (yet),
> so I am not sure bisect is going to be possible.

Yeah, it isn't repeatable.

>
> This is a corporate environment where practically everything is in an
> automount. Specific to this problem I was repeatedly building a
> workspace in one window, using cscope in another and checking code
> against a different workspace in a third -- all 3 of those were
> different automounts and different NAS servers.
>
> From objdump on vmlinux the line in question is fs/autofs4/expire.c:465
>
> if (ino->flags & AUTOFS_INF_EXPIRING) {

Right, there haven't been changes to the autofs kernel code that affect
the reference counting of dentries, so I have to conclude this is being
caused by other changes.

When walking an autofs path, the walk should always be put into refwalk
mode, so the function containing this line should always have a dentry
with a reference held. That just means the autofs info struct (ino
here) won't be invalid.

Now ->d_release() (which frees ino) is only called after the dentry
reference count falls to zero and the dentry is going away.
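
To make the lifetime rule above concrete, here is a minimal user-space
sketch. It is not the kernel code: every toy_* name and the
TOY_INF_EXPIRING value are made up for illustration, and the toy keeps
the dentry itself alive after release purely so its state can be
inspected. The only thing it is meant to show is that the info struct
stays valid exactly as long as a reference is held, and that the
release hook frees it without clearing the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag value, standing in for AUTOFS_INF_EXPIRING. */
#define TOY_INF_EXPIRING 0x1u

/* Toy stand-in for the autofs info struct ("ino" in the text). */
struct toy_ino {
	unsigned int flags;
};

/* Toy stand-in for a dentry; not the real struct dentry. */
struct toy_dentry {
	int count;              /* models the dentry reference count */
	struct toy_ino *fsdata; /* models dentry->d_fsdata */
	bool released;          /* set once the release hook has run */
};

/* Allocate a dentry plus its info struct, with one reference held. */
static struct toy_dentry *toy_dentry_new(unsigned int flags)
{
	struct toy_dentry *d = malloc(sizeof(*d));
	struct toy_ino *ino = malloc(sizeof(*ino));

	if (!d || !ino) {
		free(d);
		free(ino);
		return NULL;
	}
	ino->flags = flags;
	d->count = 1;
	d->fsdata = ino;
	d->released = false;
	return d;
}

/* Take an extra reference; only legal while the dentry is still live. */
static void toy_dget(struct toy_dentry *d)
{
	assert(!d->released);
	d->count++;
}

/* Models ->d_release(): frees ino but does not NULL the pointer. */
static void toy_d_release(struct toy_dentry *d)
{
	free(d->fsdata); /* d->fsdata is now dangling, not NULL */
	d->released = true;
}

/* Drop a reference; returns true if this was the last one. */
static bool toy_dput(struct toy_dentry *d)
{
	assert(d->count > 0);
	if (--d->count > 0)
		return false;
	toy_d_release(d);
	return true;
}

/* Safe only while the caller holds a reference, as refwalk guarantees. */
static bool toy_is_expiring(const struct toy_dentry *d)
{
	assert(d->count > 0 && !d->released);
	return (d->fsdata->flags & TOY_INF_EXPIRING) != 0;
}
```

With two references held, the first toy_dput() leaves the info struct
intact and toy_is_expiring() is fine to call. After the second
toy_dput() the info struct is gone, and any caller that still reaches
the flags test is looking at freed memory.
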
We can't check ino for NULL here because the dentry pointer to it isn't
set to NULL when it's freed in ->d_release(). Setting the dentry field
to NULL is futile because the next thing the VFS does is to free the
dentry itself. Well, it calls RCU to schedule the free anyway.

The fact that ->d_release() has been called makes me think there's a
reference counting problem somewhere in the VFS.

Al, is my thinking correct here?

There were some significant changes to this area of the VFS in 3.11 by
the look of it.

So more history please: had you used 3.11 for an extended amount of
time before using the 3.12-rc?

IOW, what's your kernel version use history please?

Ian