On Thu, Oct 7, 2010 at 19:49, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> I've spent quite a while hunting that crap down; reverting the VFS fix
> mentioned in the original thread *does* get rid of the symptoms, but so
> does the patch below.
>
> What happens is this: if ->follow_link() (usually something like
> stat("/proc/2/fd", ...) done by pidof(8)) returns ERR_PTR(-....), we
> return to __do_follow_link() and do the following:
>
>     *p = dentry->d_inode->i_op->follow_link(dentry, nd);
>     error = PTR_ERR(*p);
>     if (!IS_ERR(*p)) {
>         char *s = nd_get_link(nd);
>         error = 0;
>         if (s)
>             error = __vfs_follow_link(nd, s);
>         else if (nd->last_type == LAST_BIND) {
>             error = force_reval_path(&nd->path, nd);
>             if (error)
>                 path_put(&nd->path);
>         }
>     }
>     return error;
>
> We _should_ return a non-zero value; IS_ERR(ERR_PTR(-n)) is 1 and
> PTR_ERR(ERR_PTR(-n)) is -n.  What happens instead is that this thing
> actually returns 0.  And no, it's not a miscompile.
>
> The patch below removes the symptoms of the bug, but only if both parts
> are present.  I.e. *not* doing "report = 1" in proc_pid_follow_link()
> gives us visible breakage, despite the fact that report is initialized
> as 1 and nothing except proc_pid_follow_link() ever tries to assign
> anything to it.  Seeing that fs/namei.c and fs/proc/base.c are compiled
> separately, we can exclude gcc problems.
>
> The cheapest way to reproduce is to boot with init=/bin/sh, then mount
> /proc and have stat("/proc/2/exe", &st) called; if stat() returns 0, we
> are fscked.  The critical part runs from the return from proc_exe_link()
> (we'll leave it via "if (!mm) return -ENOENT;") to the return from
> __do_follow_link() -> do_follow_link() -> link_path_walk().
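To make the expected behaviour of the __do_follow_link() snippet above concrete, here is a minimal userspace sketch of the ERR_PTR()/IS_ERR()/PTR_ERR() convention and of the error path in question. It is a model under stated assumptions, not the kernel code: the three helpers are simplified stand-ins for the ones in include/linux/err.h, model_follow_link() and model_do_follow_link() are hypothetical names invented for illustration, and the success branch is reduced to "error = 0".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel helpers: small negative
 * errno values are encoded as pointers at the very top of the address
 * space, where no valid object can live. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for ->follow_link(): fails with -ENOENT when the
 * task has no mm (as for a kernel thread), else returns a dummy cookie. */
void *model_follow_link(int has_mm)
{
    static char cookie;

    return has_mm ? (void *)&cookie : ERR_PTR(-ENOENT);
}

/* Same shape as the quoted error handling: take PTR_ERR() first, then
 * overwrite it with 0 only if the pointer is not an error pointer. */
int model_do_follow_link(int has_mm)
{
    void *p = model_follow_link(has_mm);
    int error = (int)PTR_ERR(p);

    if (!IS_ERR(p))
        error = 0;
    return error;
}

/* Sanity check of the round trip the quoted text relies on. */
void check_err_ptr_roundtrip(void)
{
    assert(IS_ERR(ERR_PTR(-ENOENT)));
    assert(PTR_ERR(ERR_PTR(-ENOENT)) == -ENOENT);
}
```

In this model, model_do_follow_link(0) returns -ENOENT and model_do_follow_link(1) returns 0, which is what the kernel function is supposed to do; the report above is that the real code path ends up returning 0 in the failing case.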
I booted 2.6.36-rc7-atari-00360-g0dd2e6a (my current private test kernel)
with init=/bin/sh, mounted /proc, and tried

    for i in $(seq 1000); do stat /proc/2/exe; done

a few times, but I didn't see any ida_remove messages.
It cannot read the /proc/2/exe symlink, though.

This is on aranym-0.9.9-1 from Ubuntu/amd64.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
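A note on the reproduction step: GNU stat(1) without -L uses lstat(2), which does not follow the symlink, so the shell loop above may never reach ->follow_link(). A small C sketch that calls stat(2) directly, as the quoted report asks for, could look like the following. The helper names stat_errno() and count_stat_successes() are made up for illustration; the path and iteration count would come from the messages above (/proc/2/exe, 1000).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: call stat(2), which follows symlinks, and return
 * 0 on success or the negative errno value on failure. */
int stat_errno(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == 0)
        return 0;
    return -errno;
}

/* Hypothetical helper: repeat the call and count how many times stat()
 * reports success. */
int count_stat_successes(const char *path, int iterations)
{
    int hits = 0;

    for (int i = 0; i < iterations; i++) {
        if (stat_errno(path) == 0)
            hits++;
    }
    return hits;
}
```

For a kernel thread such as pid 2, count_stat_successes("/proc/2/exe", 1000) should be 0 on a healthy kernel; any non-zero count would match the "stat() returns 0" symptom described in the quoted report.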