Re: Possible FS race condition between iterate_dir and d_alloc_parallel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/9/3 23:41, Al Viro wrote:

> On Tue, Sep 03, 2019 at 04:40:07PM +0100, Al Viro wrote:
>> On Tue, Sep 03, 2019 at 10:44:32PM +0800, zhengbin (A) wrote:
>>> We recently encountered an oops(the filesystem is tmpfs)
>>> crash> bt
>>>  #9 [ffff0000ae77bd60] dcache_readdir at ffff0000672954bc
>>>
>>> The reason is as follows:
>>> Process 1 cat test which is not exist in directory A, process 2 cat test in directory A too.
>>> process 3 create new file in directory B, process 4 ls directory A.
>>
>> good grief, what screen width do you have to make the table below readable?
>>
>> What I do not understand is how the hell does your dtry2 manage to get actually
>> freed and reused without an RCU delay between its removal from parent's
>> ->d_subdirs and freeing its memory.  What should've happened in that
>> scenario is
>> 	* process 4, in next_positive() grabs rcu_read_lock().
>> 	* it walks into your dtry2, which might very well be
>> just a chunk of memory waiting to be freed; it sure as hell is
>> not positive.  skipped is set to true, 'i' is not decremented.
>> Note that ->d_child.next points to the next non-cursor sibling
>> (if any) or to the ->d_subdir of parent, so we can keep walking.
>> 	* we keep walking for a while; eventually we run out of
>> counter and leave the loop.
>>
>> Only after that we do rcu_read_unlock() and only then anything
>> observed in that loop might be freed and reused.
You are right, I miss this.
>>
>> Confused...  OTOH, I might be misreading that table of yours -
>> it's about 30% wider than the widest xterm I can get while still
>> being able to read the font...

The table is my guess. This oops happens sometimes

(We have one vmcore, others just have log, and the backtrace is same with vmcore, so the reason should be same).

Unfortunately, we do not know how to reproduce it. The vmcore has such a law:

1、dirA has 177 files, and it is OK

2、dirB has 25 files, and it is OK

3、When we ls dirA, it begins with ".", "..", dirB's first file, second file... last file,  last file->next = &(dirB->d_subdirs)

-------->

crash> struct dir_context ffff0000ae77be30  --->dcache_readdir ctx

struct dir_context {

actor = 0xffff00006727d760 <filldir64>,

pos = 27   --->27 = . + .. + 25 files

}


next_positive

  for (p = from->next; p != &parent->d_subdirs; p = p->next)  --->parent is dirA, so will continue


This should be a bug, I think it is related with locks,  especially with commit ebaaa80e8f20 ("lockless next_positive()").

Howerver, until now, I do not find the reason, Any suggestions?
> Incidentally, which kernel was that on?
4.19-stable,  the code of iterate_dir and d_alloc_parallel is same with master
>
> .
>




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux