On Tue, Feb 23, 2016 at 04:34:59PM +0100, Dmitry Vyukov wrote: > The crash: > > [ 8095.048336] ------------[ cut here ]------------ > [ 8095.048864] WARNING: CPU: 3 PID: 5532 at fs/namei.c:1672 > should_follow_link.part.25+0x55/0x21a() NULL or ERR_PTR() passed as inode to should_follow_link(). > [ 8095.060526] BUG: unable to handle kernel NULL pointer dereference > at 000000000000000c OK, NULL inode it is. And that was in do_last(). > And here is my inode.o: > https://gist.githubusercontent.com/dvyukov/27ec88c2c1a83c2e0f38/raw/2514d0ddd7720a978e6a2f67c2dcb391046ce0e7/gistfile1.txt > > This can be reproduced following the instructions here: > https://github.com/google/syzkaller/wiki/How-to-execute-syzkaller-programs > Using this command line: > # ./syz-execprog -cover=0 -procs=60 -repeat=0 prog > with the following program: > https://gist.githubusercontent.com/dvyukov/fc026f36f9f76d1a440b/raw/0e133afa99eb7de45880523fbd48256cd2ae4a6c/gistfile1.txt > (requires CONFIG_USER_NS=y). The crash triggers after hours of execution. Joy... Another interesting question is whether we'd been in RCU mode at the time of that should_follow_link(). The thing is, we could've come there either from if (!(open_flag & O_CREAT)) { if (nd->last.name[nd->last.len]) nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY; /* we _can_ be in RCU mode here */ error = lookup_fast(nd, &path, &inode, &seq); if (likely(!error)) goto finish_lookup; or from BUG_ON(nd->flags & LOOKUP_RCU); inode = d_backing_inode(path.dentry); seq = 0; /* out of RCU mode, so the value doesn't matter */ if (unlikely(d_is_negative(path.dentry))) { path_to_nameidata(&path, nd); return -ENOENT; } finish_lookup: In the latter case we are holding a reference to path.dentry, so d_is_negative ought to be reliable and refering to the same backing inode. In the former, if we leave still in RCU mode, we went through *inode = d_backing_inode(dentry); negative = d_is_negative(dentry); [check that dentry->d_seq is still unchanged] ... [check that negative is false] and that guarantees that both inode and negative had been taken while dentry remained stable, so we couldn't pass through the second check with NULL inode. And returning 0 in non-RCU mode means that we go through if (unlikely(d_is_negative(dentry))) { dput(dentry); return -ENOENT; } path->mnt = mnt; path->dentry = dentry; err = follow_managed(path, nd); if (likely(!err)) *inode = d_backing_inode(path->dentry); return err; with dentry pinned, so NULL inode here is also bloody odd - we have positive dentry that will remain positive through all that and somehow follow_managed() (in non-RCU mode) gets us a negative one. Now, follow_managed() either leaves path->dentry unchanged (and keeps it pinned through all of that), or does struct vfsmount *mounted = lookup_mnt(path); if (mounted) { dput(path->dentry); if (need_mntput) mntput(path->mnt); path->mnt = mounted; path->dentry = dget(mounted->mnt_root); (and ->mnt_root should never be negative), or goes into follow_autmount(), where we either leave the damn thing unchanged or hit path->dentry = dget(mnt->mnt_root); ... or we have ->d_automount() instance doing something nasty to it. Damn. OK, a look through the instances shows that only autofs4 one might modify path->dentry: struct dentry *new = d_lookup(parent, &dentry->d_name); if (!new) return NULL; ino = autofs4_dentry_ino(new); ino->last_used = jiffies; dput(path->dentry); path->dentry = new; in autofs4_mountpoint_changed()... I doubt that this is the cause here, but let's slap WARN_ON(d_is_negative(new)) there. The thing is, I *do* see one bug around should_follow_link(), but it would manifest differently. So you must be hitting something else there, to get that NULL inode... Could you try to reproduce it with the patch below and see what warnings trigger? I'll try to reproduce it as well, but since you already have a working setup... diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index c6d7d3d..86f81e3 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -323,6 +323,7 @@ static struct dentry *autofs4_mountpoint_changed(struct path *path) struct dentry *new = d_lookup(parent, &dentry->d_name); if (!new) return NULL; + WARN_ON(d_is_negative(new)); ino = autofs4_dentry_ino(new); ino->last_used = jiffies; dput(path->dentry); diff --git a/fs/namei.c b/fs/namei.c index f624d13..ac00bcb 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1209,6 +1209,7 @@ static int follow_managed(struct path *path, struct nameidata *nd) /* Handle an automount point */ if (managed & DCACHE_NEED_AUTOMOUNT) { ret = follow_automount(path, nd, &need_mntput); + WARN_ON(d_is_negative(path->dentry)); if (ret < 0) break; continue; @@ -1613,8 +1614,10 @@ unlazy: path->mnt = mnt; path->dentry = dentry; err = follow_managed(path, nd); - if (likely(!err)) + if (likely(!err)) { *inode = d_backing_inode(path->dentry); + WARN_ON(!inode); + } return err; need_lookup: @@ -1712,6 +1715,17 @@ static inline int should_follow_link(struct nameidata *nd, struct path *link, return 0; if (!follow) return 0; + /* make sure that d_is_symlink above matches inode */ + if (nd->flags & LOOKUP_RCU) { + if (read_seqcount_retry(&link->dentry->d_seq, seq)) { + WARN_ON(1); // just as way to report hitting + // that path; it's OK to get + // here, in the final variant + // WARN_ON will disappear. + return -ECHILD; + } + } + WARN_ON(!inode); // now, _that_ should not happen. return pick_link(nd, link, inode, seq); } @@ -3273,6 +3287,10 @@ opened: goto exit_fput; } out: + if (unlikely(error > 0)) { + WARN_ON(1); + error = -EINVAL; + } if (got_write) mnt_drop_write(nd->path.mnt); path_put(&save_parent); -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html