Re: Critical bug on bcache kernel module in Fedora 30

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/5/17 12:45 下午, Pierre JUHEN wrote:
> Le 16/05/2019 à 16:12, Coly Li a écrit
> 
> 
>> Such problem is very easy to produce, only a few sequential I/Os may
>> trigger a panic from bch_btree_iter_next_check().
>>
>> Here is content of my fio script,
>> [global]
>> lockmem=1
>> direct=1
>> ioengine=psync
>>
>> [job1]
>> filename=/dev/bcache0
>> readwrite=write
>> blocksize=4k
>> iodepth=1
>> numjobs=1
>> io_size=64K
>>
>> In my observation, 2 sequential I/Os may trigger this panic. Here is the
>> kernel output when panic in bch_btree_iter_next_check() triggered,
>>
>> [  153.478620] bcache: bch_dump_bset() block 1 key 0/9:
>> [  153.478621] bcache: bch_bkey_dump()  0x0:0x10 len 0x8 -> [check dev]
>> [  153.478623] bcache: bch_bkey_dump()  bucket 7560168717
>> [  153.478624] bcache: bch_bkey_dump()
>> [  153.478624]
>> [  153.478625] bcache: bch_dump_bset() block 1 key 3/9:
>> [  153.478626] bcache: bch_bkey_dump()  0x0:0x18 len 0x8 -> [check dev]
>> [  153.478627] bcache: bch_bkey_dump()  bucket 4400861924
>> [  153.478628] bcache: bch_bkey_dump()
>> [  153.478628]
>> [  153.478629] bcache: bch_dump_bset() Key skipped backwards
>> [  153.478629]
>> [  153.478630] bcache: bch_dump_bset() block 1 key 6/9:
>> [  153.478631] bcache: bch_bkey_dump()  0x0:0x10 len 0x8 -> [0:392192
>> gen 1] dirty
>> [  153.478632] bcache: bch_bkey_dump()  bucket 383
>> [  153.478635] bcache: bch_bkey_dump()
>> [  153.478635]
>> [  153.532890] Kernel panic - not syncing: Key skipped backwards
>> [  153.535924] CPU: 0 PID: 790 Comm: bcache_writebac Tainted: G        W
>>          5.1.0+ #3
>> [  153.539656] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>> Desktop Reference Platform, BIOS 6.00 04/13/2018
>> [  153.545002] Call Trace:
>> [  153.546702]  dump_stack+0x85/0xc0
>> [  153.548675]  panic+0x106/0x2da
>> [  153.550560]  ? bch_ptr_invalid+0x10/0x10 [bcache]
>> [  153.553178]  bch_btree_iter_next_filter.cold+0xff/0x12e [bcache]
>> [  153.556117]  ? btree_insert_key+0x190/0x190 [bcache]
>> [  153.558646]  bch_btree_map_keys_recurse+0x5c/0x190 [bcache]
>> [  153.561557]  bch_btree_map_keys+0x177/0x1a0 [bcache]
>> [  153.564085]  ? btree_insert_key+0x190/0x190 [bcache]
>> [  153.566688]  ? dirty_init+0x80/0x80 [bcache]
>> [  153.569224]  bch_refill_keybuf+0xcc/0x290 [bcache]
>> [  153.571609]  ? finish_wait+0x90/0x90
>> [  153.573525]  ? dirty_init+0x80/0x80 [bcache]
>> [  153.575705]  bch_writeback_thread+0x3b9/0x5c0 [bcache]
>> [  153.578527]  ? __kthread_parkme+0x58/0x80
>> [  153.580662]  kthread+0x108/0x140
>> [  153.582541]  ? read_dirty+0x620/0x620 [bcache]
>> [  153.584998]  ? kthread_park+0x90/0x90
>> [  153.586991]  ret_from_fork+0x3a/0x50
>> [  153.589236] Kernel Offset: 0x12000000 from 0xffffffff81000000
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [  153.594868] ---[ end Kernel panic - not syncing: Key skipped
>> backwards ]---
>>
>> The panic happens in bch_writeback context, because bkeys in btree node
>> is not in linear increasing order. Adjacent two sequential write
>> requests is very common condition in bcache, such corrupted btree node
>> is not reported in recent 2~3 years. Unless the kernel is compiled with
>> gcc9...
>>
>> It is not clear to me why the key 0:0x10 appears in same btree node
>> twice, and why there are 3 keys for two 4K write requests.
>>
>> If anyone may have any clue, please offer. Now I continue to check how
>> this happens.
>>
>> Thanks.
>>
> Hi,
> 
> Either Rolf and I didn't experiment a kernel panic, probably because we
> are working a bcache device already used at 100% or close.
> 

It seems something wrong in btree related code with GCC9.

> The trace shows that's the btree structure that is corrupted, and
> probably the adresses of the next btree nodes. This could be in the
> tasks that walk's inside the btree :
> 
> Could you check that the size of each btree node is the same with the
> two versions of the compiler  ? (alignment issue)
> 

Not yet, but since the keys are combined with 64bit words, it should not
be problematic for bits alignment. bkey assignment is not by data
structure, is by several uint64 words, this is reliable operation.


> It could be also a problem where the end of the btree are not correctly
> detected and you jump on un-itinialized node.
> 

This is suspicious. I only sent two 4K write, but there are three keys
in the btree node. Currently I am not fully understand how bcache btree
code works, still on the way to dig out.

> Since you are able to reproduce it very fast, could you put a trace that
> dump the btree nodes that are crossed during the search ?

Let me try.

Thanks.


-- 

Coly Li



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux ARM Kernel]     [Linux Filesystem Development]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux