Re: Race condition introduced in 4bf46a27 VFS: Impose ordering on accesses of d_inode and d_flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dominique Martinet wrote on Wed, Jul 22, 2015 at 05:45:19PM +0200:
> I'm getting a sneaky race condition failure since this commit (after
> painstakingly bisecting it through the day):
> 
> commit 4bf46a272647d89e780126b52eda04737defd9f4
> Author: David Howells <dhowells@xxxxxxxxxx>
> Date:   Thu Mar 5 14:09:22 2015 +0000
> 
>     VFS: Impose ordering on accesses of d_inode and d_flags
> 
>     Impose ordering on accesses of d_inode and d_flags to avoid the need
>     to do this:
>     
>         if (!dentry->d_inode || d_is_negative(dentry)) {
>     
>     when this:
>     
>         if (d_is_negative(dentry)) {
>     
>     should suffice.
>     
>     This check is especially problematic if a dentry can have its type
>     field set to something other than DENTRY_MISS_TYPE when d_inode is
>     NULL (as in unionmount).
>     
>     What we really need to do is stick a write barrier between setting
>     d_inode and setting d_flags and a read barrier between reading
>     d_flags and reading d_inode.
>     
> 
> 
> 
> Test run:

Ok so, instead of kernel compile, I got a simpler reproducer, on a virtio-9P
mount:

  #!/bin/bash
  for dir in {0..2}; do
    mkdir -p d.${dir}/{d,d2}/d/d
    echo foo > d.${dir}/d/d/d/f
  done

  function foo() {
    # return code 255 tells xargs to stop immediately
    cat d.$(($1%3))/{d,d2}/{d,d2}/d/f | grep -q foo || return 255
  }
  export -f foo

  date
  while ! seq 1 30000 | xargs -n 1 -P 16 -I{} bash -c "foo {}" 2>&1 |\
            grep -E 'd.[0-9]/d/d/d/f'; do
     date
  done


After looking at what the compilation was doing, basically trying to
open a lot of header files where there is done then going forward to the
next directory etc, the reproducer needs to try to open non-existing
entries to hit the bug.

This seems to be getting ENOENT way more often than ENOTDIR though,
since it's one ENOENT in the middle of plenty of "valid" ENOENT it's not
easy to debug...

I also unfortunately couldn't reproduce on my 2-core+hyperthreading
laptop, the host where the test fails is a bi-socket 8 core+ht so 32
logical core.
Since it's a memory barrier issue, what might help most is the bisocket
part, I'd try to pin tasks but it doesn't strike me as obvious from
inside a VM.


> A co-worker pointed out that __d_set_inode_and_type clears d_flags for
> DCACHE_ENTRY_TYPE and DCACHE_FALLTHRU, and __d_obtain_alias didn't clear
> these before, but I still get the same error if I don't (although it did
> seem longer to reproduce).
> I have no idea what __d_obtain_alias does or if it's ok if it clears it,
> just saying I tried.

Could use an enlightened opinion on wether __d_obtain_alias should clear
DCACHE_ENTRY_TYPE and DCACHE_FALLTHU, independantly of the problem at
hand?


> My guess is that I'm just seeing a race condition that already existed
> but the barriers make it easier to reproduce it.
> This commit came with 2b0143b5c9 "VFS: normal filesystems (and lustre):
> d_inode() annotations" which made 9P access d_inode through the helper,
> which could have helped with ordering, but I'm still hitting the bug
> "easily" with that commit.

Still looking for ideas to help diagnose further...

-- 
Dominique
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux