Re: [PATCH] block: fix request.queuelist usage in flush

On 2024/6/6 16:44, Friedrich Weber wrote:
> On 05/06/2024 16:27, Chengming Zhou wrote:
>> On 2024/6/5 21:34, Friedrich Weber wrote:
>>> On 05/06/2024 12:54, Friedrich Weber wrote:
>>> [...]
>>>
>>> My results:
>>>
>>> Booting the Debian (virtual) machine with mainline kernel v6.10-rc2
>>> (c3f38fa61af77b49866b006939479069cd451173):
>>> works fine, no crash
>>>
>>> Booting the Debian (virtual) machine with patch "block: fix
>>> request.queuelist usage in flush" applied on top of v6.10-rc2: The
>>> Debian (virtual) machine crashes during boot with [1].
>>>
>>> Hope this helps! If I can provide anything else, just let me know.
>>
>> Thanks for your help, I still can't reproduce it myself, don't know why.
> 
> Weird -- when booting the Debian machine into mainline kernel v6.10-rc2
> with "block: fix request.queuelist usage in flush" applied on top, it
> crashes reliably for me. The machine having its root on LVM seems to be
> essential to reproduce the crash, though.

Yeah, right. It seems LVM may create a special request that has only
PREFLUSH | POSTFLUSH without any DATA, which then goes into the flush state
machine. That causes the request to go through list_add_tail() twice without
a list_del_init() in between. I don't know the reason behind it, but it is
allowed by the current flush code.
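
To illustrate why a double list_add_tail() without list_del_init() is a
problem, here is a minimal userspace sketch. It is not the kernel code: the
list helpers are simplified imitations of the ones in include/linux/list.h,
and the names "running", "pending", demo_double_add() and
demo_del_then_add() are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace imitation of the kernel's circular list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link entry just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
	head->prev = entry;
}

/* Unlink entry from its list and make it point at itself again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Walk forward from head; stop at limit steps so a broken list ends. */
static int list_count(const struct list_head *head, int limit)
{
	const struct list_head *p;
	int n = 0;

	assert(limit > 0);
	for (p = head->next; p != head && n < limit; p = p->next)
		n++;
	return n;
}

/* Add rq to a second list while it is still linked on the first one. */
int demo_double_add(int limit)
{
	struct list_head running, pending, rq;

	INIT_LIST_HEAD(&running);
	INIT_LIST_HEAD(&pending);
	INIT_LIST_HEAD(&rq);
	list_add_tail(&rq, &running);
	list_add_tail(&rq, &pending);	/* rq was never unlinked */
	/* The walk never gets back to &running, so it hits the limit. */
	return list_count(&running, limit);
}

/* Same sequence, but unlink rq before adding it to the second list. */
int demo_del_then_add(int limit)
{
	struct list_head running, pending, rq;

	INIT_LIST_HEAD(&running);
	INIT_LIST_HEAD(&pending);
	INIT_LIST_HEAD(&rq);
	list_add_tail(&rq, &running);
	list_del_init(&rq);
	list_add_tail(&rq, &pending);
	return list_count(&running, limit) == 0 &&
	       list_count(&pending, limit) == 1;
}
```

In the first function the old list still points at rq, but rq's own pointers
now belong to the new list, so walking the old list never returns to its
head. In the second, both lists stay consistent.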

> 
> Maybe the fact that I'm running the Debian machine virtualized makes the
> crash more likely to trigger. I'll try to reproduce on bare metal to
> narrow down the reproducer and get back to you.

Thanks very much for describing your process in such detail on that thread!

> 
>> Could you help to test with this diff?
>>
>> diff --git a/block/blk-flush.c b/block/blk-flush.c
>> index e7aebcf00714..cca4f9131f79 100644
>> --- a/block/blk-flush.c
>> +++ b/block/blk-flush.c
>> @@ -263,6 +263,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
>>                 unsigned int seq = blk_flush_cur_seq(rq);
>>
>>                 BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
>> +               list_del_init(&rq->queuelist);
>>                 blk_flush_complete_seq(rq, fq, seq, error);
>>         }
> 
> I used mainline kernel v6.10-rc2 as base and applied:
> 
> - "block: fix request.queuelist usage in flush"
> - Your `list_del_init` addition from above
> 
> and if I boot the Debian machine into this kernel, I do not get the
> crash anymore.

Good to hear. So can I merge these two diffs into one patch and add
your Tested-by?

> 
> Happy to run more tests for you, just let me know.

Thanks again!



