Re: [PATCH] Re: Possible FS race condition between iterate_dir and d_alloc_parallel

On Sat, Sep 14, 2019 at 1:04 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> An obvious approach would be to have them sit in the lists hanging off
> dentries, with pointer to dentry in the cursor itself.  It's not hard
> to do, with "move" costing O(#Cursors(dentry1)) and everything else
> being O(1), but I hate adding a pointer to each struct dentry, when
> it's completely useless for most of the filesystems *and* NULL for
> most of dentries on dcache_readdir()-using one.

Yeah, no, I think we should just do the straightforward proper locking
for now, and see if anything even cares.

Last time I think it was a microbenchmark that showed a regression,
not necessarily even a real load (reaim), and there wasn't a _ton_ of
debugging on exactly what triggered the higher system time. It was
also on a fairly unusual system that would show the lock contention
much more than 99+% of other machines would.

So I suspect we might have other avenues of improvement than just the
cursor thing. Yes, it would be absolutely lovely to not have the
cursor, and avoid the resulting growth of the dentry child list, which
then makes everything else much more expensive.

But it is also possible that we could avoid some of that O(n**2)
behavior by simply not adding the cursor to the end of the dentry
child list at all. Right now your patch *always* sets the cursor at a
valid point - even if it's at the end of the directory. But we could
skip the "end of the directory" case entirely and just set a flag in
the file for "at eof" instead.
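
Just to make the bookkeeping concrete, here's a purely illustrative
userspace sketch - toy_dirent/toy_file/toy_readdir are made-up names,
nothing here is actual kernel code or kernel data structures - of what
I mean by "a flag in the file instead of an end-of-list cursor":

/*
 * Toy model: a directory is just a singly linked "child list" of
 * entries, and each open file remembers where it stopped.  When one
 * readdir() batch runs off the end, remember "at EOF" in the file
 * instead of parking a cursor on the tail of the list.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_dirent {
	const char *name;
	struct toy_dirent *next;	/* the "child list" link */
};

struct toy_file {
	struct toy_dirent *pos;		/* last entry emitted, or NULL */
	bool cursor_linked;		/* would a cursor sit in the list? */
	bool at_eof;			/* the flag replacing an EOF cursor */
};

/* Emit up to 'count' names, like one readdir() system call. */
static int toy_readdir(struct toy_file *f, struct toy_dirent *first, int count)
{
	struct toy_dirent *d;
	int emitted = 0;

	if (f->at_eof)
		return 0;

	d = f->pos ? f->pos->next : first;
	while (d && emitted < count) {
		printf("  %s\n", d->name);
		f->pos = d;
		d = d->next;
		emitted++;
	}

	if (!d) {
		/* Ran off the end: no cursor in the list, just a flag. */
		f->pos = NULL;
		f->cursor_linked = false;
		f->at_eof = true;
	} else {
		/* Stopped mid-directory: the only case needing a cursor. */
		f->cursor_linked = true;
	}
	return emitted;
}

int main(void)
{
	struct toy_dirent c = { "c", NULL };
	struct toy_dirent b = { "b", &c };
	struct toy_dirent a = { "a", &b };
	struct toy_file f = { NULL, false, false };
	int n;

	do {
		printf("readdir batch:\n");
		n = toy_readdir(&f, &a, 2);
		printf("  cursor linked: %s, at EOF: %s\n",
		       f.cursor_linked ? "yes" : "no",
		       f.at_eof ? "yes" : "no");
	} while (n > 0);
	return 0;
}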

That way the cursor at least wouldn't exist for the common cases when
we return to user space (at the beginning of the readdir and at the
end). Which might make the cursors simply not be quite as common, even
when you have a _lot_ of concurrent readdir() users.

There may be other similar things we could do to minimize the pressure
on the parent dentry lock. For example, maybe we could insert a cursor
only _between_ readdir() calls, but then in the readdir() loop itself,
we know we hold the inode lock shared for the directory, so during
_that_ loop we can use the existing positive dentries that we keep a
refcount to as cursors.

Because if the only reason to not use existing dentry pointers as
cursors is the concurrent rename() issue, then _that_ is certainly
something that the parent inode shared lock should protect against,
even if the child dentry list can change in other ways.
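
And again as a purely illustrative userspace toy - pthread_rwlock_t
standing in for the directory inode lock, the refcount for dget()/dput()
style pinning, none of the names being real kernel interfaces - this is
roughly the shape of "the pinned entry is the cursor while the lock is
held, and a separate cursor only matters between calls":

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_dirent {
	const char *name;
	int refcount;			/* pins the entry between calls */
	struct toy_dirent *next;
};

struct toy_dir {
	pthread_rwlock_t lock;		/* stand-in for the dir inode lock */
	struct toy_dirent *first;
};

struct toy_file {
	struct toy_dirent *resume;	/* pinned entry from the last call */
	bool cursor_linked;		/* cursor needed between calls? */
	bool at_eof;
};

static int toy_readdir(struct toy_file *f, struct toy_dir *dir, int count)
{
	struct toy_dirent *d;
	int emitted = 0;

	if (f->at_eof)
		return 0;

	/* The whole batch runs under the shared directory lock. */
	pthread_rwlock_rdlock(&dir->lock);

	d = f->resume ? f->resume->next : dir->first;
	if (f->resume)
		f->resume->refcount--;	/* done with last call's position */
	f->resume = NULL;
	f->cursor_linked = false;

	while (d && emitted < count) {
		/*
		 * Under the lock, 'd' itself is the position; nothing
		 * extra is linked into the child list during the loop.
		 */
		printf("  %s\n", d->name);
		f->resume = d;
		d = d->next;
		emitted++;
	}

	if (!d) {
		f->resume = NULL;
		f->at_eof = true;	/* EOF flag, as in the sketch above */
	} else {
		/*
		 * Returning to userspace mid-directory: pin the last
		 * emitted entry; this is the only point where the real
		 * thing would link a cursor after it.
		 */
		f->resume->refcount++;
		f->cursor_linked = true;
	}

	pthread_rwlock_unlock(&dir->lock);
	return emitted;
}

int main(void)
{
	struct toy_dirent c = { "c", 0, NULL };
	struct toy_dirent b = { "b", 0, &c };
	struct toy_dirent a = { "a", 0, &b };
	struct toy_dir dir = { .first = &a };
	struct toy_file f = { NULL, false, false };
	int n;

	pthread_rwlock_init(&dir.lock, NULL);
	do {
		n = toy_readdir(&f, &dir, 2);
		printf("  cursor needed between calls: %s\n",
		       f.cursor_linked ? "yes" : "no");
	} while (n > 0);
	pthread_rwlock_destroy(&dir.lock);
	return 0;
}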

So if the main 'reaim' problem was that the dentry child list itself
grew due to the cursor pressure (and that is likely) and that in turn
then caused the O(n**2) bad locking behavior due to having to walk
much longer dentry child chains, then I suspect that we can do a few
tweaks to simply not make that happen in practice.

Yes, I realize that I'm handwaving a bit on the two above suggestions,
but don't you think that sounds doable?

            Linus


