On Fri, Nov 29, 2013 at 04:06:59AM +0000, Al Viro wrote:
> On Fri, Nov 29, 2013 at 03:59:39AM +0000, Al Viro wrote:
> > On Fri, Nov 29, 2013 at 02:41:21AM +0000, Al Viro wrote:
> > > On Thu, Nov 28, 2013 at 06:07:27PM -0800, Linus Torvalds wrote:
> > >
> > > > HOWEVER. It's certainly *not* valid if "current->fs->root/pwd" points
> > > > to it. So yeah, there must have been an extra dput() somewhere. Or,
> > > > more likely, I think, we don't get the refcount to some dentry
> > > > properly any more.
> > > >
> > > > I don't see where, though. You did change where "LOOKUP_RCU" is
> > > > cleared in unlazy_walk() but you did add that
> > > >
> > > >         nd->path.dentry = NULL;
> > > >
> > > > and that looks like it should be ok. And I don't see what else would care.
> > >
> > > *nod*
> > >
> > > BTW, vfsmount refcount is 12, so we *definitely* nowhere near the
> > > final mntput(), etc. and mnt->mnt_root itself should also have
> > > contributed.
> > >
> > > I'm going to try to find out _which_ test buggers the refcount - at
> > > least that way I'll have something resembling a usable reproducer...
> >
> > OK, we have a winner.  generic/234 drops refcount of root dentry by about
> > 20 (and yes, I should've started with that one, what with Ted's report).
> > Run it several times (4 should suffice nicely) and the damn thing triggers
> > right there.  Uff...  At least that takes under a minute instead of a couple
> > of hours, which makes debugging that shite much more tolerable...
>
> I think I see what's going on; it *is* unlazy_walk(), but not nd->path.
> It's nd->root.  IOW, the relevant fix to fs/namei.c is
>
> @@ -513,8 +513,7 @@ static int unlazy_walk(struct nameidata *nd, struct dentry *dentry)
>
>  	if (!lockref_get_not_dead(&parent->d_lockref)) {
>  		nd->path.dentry = NULL;
> -		rcu_read_unlock();
> -		return -ECHILD;
> +		goto out;
>  	}
>
>  	/*
>
> (in addition to other pieces of fun found in process).  I'll test and post
> results in a few...

And yes, it has fixed the problem with generic/234.  I'll do a full xfstests
run to see if there's anything else, but this one is obviously needed.  I'll
send it with a sane commit message (along with the follow_dotdot_rcu() fix)
later tonight.

The path_init() race is a separate story - that one should probably go
separately, since we'll want it in all branches starting with early 2011
or so.
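
For anyone reading along, here is a toy userspace model of the failure mode
as it can be read from the quoted hunk.  It is a sketch under assumptions,
not kernel code: every toy_* name is made up for illustration, and it assumes
(the hunk does not show this) that the out: label is where unlazy_walk()
forgets the not-yet-grabbed nd->root before bailing out.  The idea is that
during the lazy walk the root pointer is borrowed without a reference, so an
early return that skips that step leaves later cleanup dropping a reference
nobody took.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real types. */
struct toy_dentry {
	int refcount;
	int dead;
};

struct toy_walk {
	struct toy_dentry *root;	/* borrowed (no reference) while lazy */
	struct toy_dentry *parent;	/* likewise */
	int lazy;
};

/* Take a reference unless the object is already dead. */
static int toy_get_not_dead(struct toy_dentry *d)
{
	if (d->dead)
		return 0;
	d->refcount++;
	return 1;
}

/* Drop a reference; NULL is a no-op. */
static void toy_put(struct toy_dentry *d)
{
	if (!d)
		return;
	assert(d->refcount > 0);
	d->refcount--;
}

/* Leave lazy mode.  fixed == 0 models the early return, 1 the goto. */
static int toy_unlazy(struct toy_walk *w, int fixed)
{
	w->lazy = 0;
	if (!toy_get_not_dead(w->parent)) {
		w->parent = NULL;
		if (!fixed)
			return -1;	/* root still set, but never grabbed */
		goto out;
	}
	if (!toy_get_not_dead(w->root))
		goto out;	/* parent is owned now; cleanup drops it */
	return 0;
out:
	w->root = NULL;		/* forget the borrowed pointer */
	return -1;
}

/* Non-lazy cleanup: drops whatever the walk still points to. */
static void toy_terminate(struct toy_walk *w)
{
	if (!w->lazy) {
		toy_put(w->parent);
		toy_put(w->root);
	}
	w->parent = NULL;
	w->root = NULL;
}

/* Root refcount after one walk that fails on a dead parent. */
static int toy_failed_walk(int fixed)
{
	struct toy_dentry root = { .refcount = 3, .dead = 0 };
	struct toy_dentry gone = { .refcount = 0, .dead = 1 };
	struct toy_walk w = { .root = &root, .parent = &gone, .lazy = 1 };

	if (toy_unlazy(&w, fixed) != 0)
		toy_terminate(&w);
	return root.refcount;
}
```

In this model toy_failed_walk(0) leaves the root at 2, one lower than it
started, while toy_failed_walk(1) leaves it at 3.  Repeat the failing walk
enough times and the count drifts down, which is at least consistent with a
test dropping the root dentry refcount by about 20.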