Re: [PATCH] Re: Possible FS race condition between iterate_dir and d_alloc_parallel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 24, 2019 at 09:55:06AM -0700, Linus Torvalds wrote:
> [ Sorry for html, I'm driving around ]
> 
> On Mon, Sep 23, 2019, 19:52 Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> 
> >
> > Argh...  The things turned interesting.  The tricky part is
> > where do we handle switching cursors away from something
> > that gets moved.
> >
> 
> I forget why we do that. Remind me?
> 
> Anyway, I think to a first approximation, we should probably first just see
> if the "remove cursors from the end" change already effectively makes most
> of the regression go away.
> 
> Because with that change, if you just readdir() things with a sufficiently
> big buffer, you'll never have the cursor hang around over several system
> calls.
> 
> Before, you'd do a readdir, add the cursor, return to user space, then do
> another readdir just to see the EOF, and then do the close().
> 
> So it used to leave the cursor in place over those two final system calls,
> and now it's all gone.
> 
> I'm sure that on the big systems you can still trigger the whole d_lock
> contention on the parent, but I wonder if it's all that much of a problem
> any more in practice with your other change..

If nothing else, it's DoSable right now.  And getting rid of that would
reduce the memory footprint, while we are at it.

In any case, it looks like btrfs really wants an empty directory there,
i.e. the right thing to do would be simple_lookup() for ->lookup.

CIFS is potentially trickier.  AFAICS, what's going on is
	* Windows has a strange export, called IPC$.  Looks like it
was (still is?) the place where they export their named pipes.  From
what I'd been able to figure out, it's always there and allows for
some other uses - it can be used to get the list of exports.  Whether
the actual named pipes are seen there these days... no idea.
	* there seems to be nothing to prevent a server (modified
samba, for example) from exporting whatever it wants under that
name.
	* IF it can be non-empty, mounting it will end up with
root directory where we can do lookups for whatever is there.
getdents() on that root will show what's currently in dcache
(== had been looked up and still has not been evicted by
memory pressure).  Mainline can get seriously buggered if
dcache_readdir() steps into something that is going away.  With the
patches in this series that's no longer a problem.  HOWEVER, if
lookup in one subdirectory picks an alias for another subdirectory
of root, we will be really screwed - shared lock on parent
won't stop d_splice_alias() from moving the alias, and that can
bloody well lead to freeing the original name.  From under
copy_to_user()...  And grabbing a reference to dentry obviously
doesn't prevent that - dentry itself won't get freed, but
external name very well might be.

Again, I don't know CIFS well enough to tell how IPC$ is really
used.  If it doesn't normally contain anything but named pipes,
we can simply have cifs_lookup() refuse to return subdirectories
on such mounts, with solves any problems with d_splice_alias().
If it's always empty - better yet.  If the impressions above
(and that's all those are) are correct, we might have a problem
in mainline with bogus/malicious servers.

That's independent from what we do with the cursors.  I asked
around a bit; no luck so far.  It would be really nice if
some CIFS folks could give a braindump on that thing...



[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux