Re: extra reference to fl->fl_file, possible regression

Jeff Layton <jlayton@xxxxxxxxxxxxxxx> · Fri, 10 Jul 2015 10:39:14 -0400

On Fri, 10 Jul 2015 14:54:44 +0200
William Dauchy <william@xxxxxxxxx> wrote:

> On Jul10 07:24, Jeff Layton wrote:
> > Huh. I'm stumped...
> > 
> > These patches are pretty straightforward. We're just taking an extra
> > reference to the filp when running lock operations so that it doesn't
> > disappear before the replies can be processed (typically in the event
> > that a signal comes in while waiting on the reply). Given the odd stack
> > trace above, I have to wonder if there's some sort of memory scribble
> > going on.
> 
> I forgot to mention that I also had the following message before
> the trace:
> 
> VFS: Close: file count is 0
> 

Ok, that may be an important clue. From filp_close:

        if (!file_count(filp)) {
                printk(KERN_ERR "VFS: Close: file count is 0\n");
                return 0;
        }

...so it looks like there could be a use-after-free going on? Somehow
we're ending up with an actual close being done after the last
reference has already been put.

So, I suspect that the problem is with the second patch (the LOCKU
one).

I'm not sure if it's responsible for that message, but one of the
things we do in __fput() is call locks_remove_flock, which can dip down
into the NFS unlock codepath.

So if a file happened to have some flock locks on it, then we could
be taking a new reference to a file that has already had its refcount
go to zero.
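
To make the suspected sequence concrete, here is a rough userspace model
of it. This is only a sketch and not kernel code: every toy_* name is made
up for illustration, the counter is a plain long rather than an atomic,
and the "unlock" step just stands in for __fput() -> locks_remove_flock()
-> the NFS unlock path with the extra reference from the LOCKU patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for struct file; only the refcount is modelled. */
struct toy_file {
        long count;     /* plays the role of f_count */
        int has_flock;  /* nonzero if flock locks are still held */
        int teardowns;  /* how many times the final-put path ran */
        int late_gets;  /* references taken after count already hit 0 */
};

static void toy_fput(struct toy_file *f);

/* Models taking a reference; records the case where teardown has begun. */
static void toy_get(struct toy_file *f)
{
        if (f->teardowns > 0)
                f->late_gets++; /* the suspected bug: ref taken on a dead file */
        f->count++;
}

/* Models the unlock path with the extra get/put pair around the "RPC". */
static void toy_unlock(struct toy_file *f)
{
        toy_get(f);
        /* ... the unlock request would be sent and awaited here ... */
        toy_fput(f);
}

/* Models fput(): the drop to zero triggers teardown, which removes locks. */
static void toy_fput(struct toy_file *f)
{
        if (--f->count != 0)
                return;
        f->teardowns++;
        if (f->teardowns > 1)
                return; /* second teardown: would be a double free for real */
        if (f->has_flock)
                toy_unlock(f);
}

/* Drop the last reference on a file and report what happened. */
struct toy_file toy_final_put(int has_flock)
{
        struct toy_file f = { 1, has_flock, 0, 0 };

        toy_fput(&f);
        printf("flock=%d late_gets=%d teardowns=%d\n",
               f.has_flock, f.late_gets, f.teardowns);
        return f;
}
```

Calling toy_final_put(1) shows one reference taken after the count
reached zero and the teardown path entered a second time, while
toy_final_put(0) tears down exactly once. Whether the real code behaves
this way is still only my suspicion at this point.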

I'll have to think about how best to deal with this as I totally missed
this when I did the original analysis of the bug. For now it's probably
best to revert that patch (though I think the one for the setlk is
likely OK).

Thanks,
-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>