On Sat, Feb 20, 2016 at 02:25:40PM +0100, Mickaël Salaün wrote:

> I think the bug may be somewhere in the nd->depth handling (when its value
> is 0) in fs/namei.c:get_link():
> struct saved *last = nd->stack + nd->depth - 1

Getting there with nd->depth == 0 would certainly be a bug - it would mean
that we got there without should_follow_link() having returned 1.  In case
of open() it would be "do_last() has returned positive without
should_follow_link() having returned 1".

<looks>

OK, there are several places where we rely on not getting bogus return
values - inode_permission() should not return positives, neither should
vfs_open(), security_path_truncate() and notify_change().  Other similar
"handle the last component" functions are guaranteed to never return
positives other than directly from should_follow_link(), so they are OK.

IIRC, you used LSM to inject a positive value to inode_permission(), right?
Another way to trigger that would've been ->open() returning positive - a bug
on *anything* since ->open() had been introduced in 0.95.  Amount of harm
would vary - e.g. 0.95 would simply have that positive number returned to
userland, looking like successful open(2).  With no new descriptor, of
course...

Short-term we probably want just
	if (unlikely(error > 0)) {
		WARN_ON(1);
		error = -EINVAL;
	}
added right after out: in do_last(), try to trigger Dmitry's reproducers on
it and then work back to the source of that thing *if* that's what's
happening in his case.  Yours almost certainly is just that.

Longer-term...  I'm not sure.  Having a method that is supposed to return
0 or -E<something> actually return positive is going to be a bad thing, no
matter what, but "that bogus value gets passed to userland" is a lot more
tolerable than "kernel memory corruption".  do_last() calling conventions
make it vulnerable to the latter, and as far as nd->stack underruns that's
it, but I'm not sure we don't have other places where such bug in driver, etc.
would translate into mess ;-/

OK, in any case, let's start with checking if Dmitry is seeing that and not
something else.  I still don't understand his stack traces - the fault
address quoted in his first posting doesn't match the register values in the
same trace, and there's also a possibility that it's an RCU-related crap.

This should give a warning and prevent an oops if we are hitting a stack
underrun on bogus positive from do_last().  Dmitry, could you try to build
with delta below and run your reproducer(s)?

diff --git a/fs/namei.c b/fs/namei.c
index f624d13..e30deef 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3273,6 +3273,10 @@ opened:
 		goto exit_fput;
 	}
 out:
+	if (unlikely(error > 0)) {
+		WARN_ON(1);
+		error = -EINVAL;
+	}
 	if (got_write)
 		mnt_drop_write(nd->path.mnt);
 	path_put(&save_parent);
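
For readers who want to poke at the failure mode outside the kernel, here is
a rough userland sketch of the convention discussed in this thread.  It is a
toy model under stated assumptions, not the fs/namei.c code: toy_top_index(),
toy_do_last() and toy_selfcheck() are hypothetical names, the symlink case is
reduced to a flag, and fprintf() stands in for WARN_ON().  It only shows why
depth == 0 gives an index one slot before the stack, and how clamping a stray
positive to -EINVAL keeps the caller from ever seeing one.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Toy model only: none of these names exist in the kernel. */

/*
 * Index of the top saved-link slot, mirroring the arithmetic in
 * "nd->stack + nd->depth - 1".  With depth == 0 the result is -1,
 * i.e. one slot before the start of the array.
 */
static int toy_top_index(int depth)
{
	return depth - 1;
}

/*
 * Simplified "handle the last component" step.  method_ret stands for
 * whatever a method such as ->open() or a permission hook returned;
 * it is supposed to be 0 or a negative errno.
 */
static int toy_do_last(int method_ret, int is_symlink)
{
	int error;

	/* Legitimate positive: returned directly, never reaches the guard. */
	if (is_symlink)
		return 1;

	error = method_ret;

	/* Analogue of the guard placed after "out:" in the patch. */
	if (error > 0) {
		fprintf(stderr, "WARN: bogus positive return %d\n", error);
		error = -EINVAL;
	}
	return error;
}

/* Small self-check showing the intended behaviour of the model. */
static void toy_selfcheck(void)
{
	assert(toy_do_last(0, 1) == 1);              /* real symlink */
	assert(toy_do_last(0, 0) == 0);              /* success */
	assert(toy_do_last(-ENOENT, 0) == -ENOENT);  /* ordinary error */
	assert(toy_do_last(3, 0) == -EINVAL);        /* bogus positive */
}
```

Calling toy_selfcheck() from a test main() exercises all four cases; only the
last one prints the warning.  In this model the caller can treat a return of
1 as "a link was pushed", so it would never compute toy_top_index(0).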