Re: [PATCH] Re: Possible FS race condition between iterate_dir and d_alloc_parallel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/9/23 5:29, Al Viro wrote:

> On Sat, Sep 14, 2019 at 09:49:21AM -0700, Linus Torvalds wrote:
>> On Sat, Sep 14, 2019 at 9:16 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>>         OK, folks, could you try the following?  It survives the local beating
>>> so far.
>> This looks like the right solution to me. Keep the locking simple,
>> take the dentry refcount as long as we keep a ref to it in "*res".
> *grumble*
>
> Example of subtleties in the whole mess: this is safe for mainline
> now, but only due to "devpts_pty_kill(): don't bother with d_delete()"
> already merged.  Without that, we are risking the following fun:
>
> scan_positive() on devpts: finds a dentry, sees it positive, decides
> to grab the sucker.  Refcount is currently 1 (will become 2 after
> we grab the reference).
>
> devpts_pty_kill(): d_delete(dentry); on that sucker.  Refcount is
> currently (still) 1, so we simply make it negative.
>
> scan_positive(): grabs an extra reference to now negative dentry.
>
> devpts_pty_kill(): dput() drops refcount to 1 (what if it got there
> before scan_positive() grabbed a reference?  Nothing, really, since
> scan_positive() is holding parent's ->d_lock; dput() wouldn't
> have progressed through dentry_kill() until it managed to get
> that, and it would've rechecked the refcount.  So that's not
> a problem)
>
> scan_positive(): returns a reference to negative dentry to
> dcache_readdir().  Which proceeds to oops on
>                 if (!dir_emit(ctx, next->d_name.name, next->d_name.len,
>                               d_inode(next)->i_ino, dt_type(d_inode(next))))
> since d_inode(next) is NULL.
>
> With the aforementioned commit it *is* safe, since the dentry remains
> positive (and unhashed), so we simply act as if dcache_readdir() has
> won the race and emitted a record for the sucker killed by devpts_pty_kill().
>
> IOW, backports will either need to bring that commit in as well, or
> they'll need to play silly buggers along the lines of
>
> 		if (simple_positive(d) && !--count) {
> 			spin_lock(&d->d_lock);
> 			if (likely(simple_positive(d)))
> 				found = dget_dlock(d);
> 			spin_unlock(&d->d_lock);
> 			if (found)
> 				break;
> 			count = 1;	// it's gone, keep scanning
>                 }
> Probably the latter, since it's less dependent on some other place
> doing what devpts used to do...

Is it possible to trigger ABBA deadlock? In next_positive, the lock order is (parent, dentry),

while in dput, the lock order is (dentry, parent). Cause we use spin_trylock(parent), so the ABBA deadlock will not open.

dput

    fast_dput

       spin_lock(&dentry->d_lock)

   dentry_kill

       spin_trylock(&parent->d_lock))

Is there any other scene like dput, but do not use spin_trylock? I am looking for the code, till now do not find this

> .
>




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux