Re: [bug report] WARNING: CPU: 1 PID: 1386 at block/blk-mq-sched.c:432 blk_mq_sched_insert_request+0x54/0x178

On 11/3/21 7:59 AM, Yi Zhang wrote:
> On Wed, Nov 3, 2021 at 7:59 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 11/2/21 9:54 PM, Jens Axboe wrote:
>>> On Nov 2, 2021, at 9:52 PM, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Nov 02, 2021 at 09:21:10PM -0600, Jens Axboe wrote:
>>>>>> On 11/2/21 8:21 PM, Yi Zhang wrote:
>>>>>>>>
>>>>>>>> Can either one of you try with this patch? Won't fix anything, but it'll
>>>>>>>> hopefully shine a bit of light on the issue.
>>>>>>>>
>>>>>> Hi Jens
>>>>>>
>>>>>> Here is the full log:
>>>>>
>>>>> Thanks! I think I see what it could be - can you try this one as well,
>>>>> would like to confirm that the condition I think is triggering is what
>>>>> is triggering.
>>>>>
>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>> index 07eb1412760b..81dede885231 100644
>>>>> --- a/block/blk-mq.c
>>>>> +++ b/block/blk-mq.c
>>>>> @@ -2515,6 +2515,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>>    if (plug && plug->cached_rq) {
>>>>>        rq = rq_list_pop(&plug->cached_rq);
>>>>>        INIT_LIST_HEAD(&rq->queuelist);
>>>>> +        WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>> +        WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>>    } else {
>>>>>        struct blk_mq_alloc_data data = {
>>>>>            .q        = q,
>>>>> @@ -2535,6 +2537,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>>                bio_wouldblock_error(bio);
>>>>>            goto queue_exit;
>>>>>        }
>>>>> +        WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>> +        WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>
>>>> Hello Jens,
>>>>
>>>> I guess the issue could be that the following code runs in
>>>> blk_mq_alloc_request() and blk_mq_alloc_request_hctx() without grabbing
>>>> ->q_usage_counter:
>>>>
>>>> .rq_flags       = q->elevator ? RQF_ELV : 0,
>>>>
>>>> If the elevator is then switched from none to a real one, the check on
>>>> q->elevator becomes inconsistent.
>>>
>>> Indeed, that’s where I was going with this. I have a patch, testing it
>>> locally but it’s getting late. Will send it out tomorrow. The nice
>>> benefit is that it allows dropping the weird ref get on plug flush,
>>> and batches getting the refs as well.
>>
>> Yi/Steffen, can you try pulling this into your test kernel:
>>
>> git://git.kernel.dk/linux-block for-next
>>
>> and see if it fixes the issue for you. Thanks!
> 
> It still can be reproduced with the latest linux-block/for-next, here
> is the log
> 
> fab2914e46eb (HEAD, new/for-next) Merge branch 'for-5.16/drivers' into
> for-next

Funky! Thanks for re-testing, I guess I need to think even harder about
this. Can't seem to reproduce it here at all, which makes it a bit
harder to poke at.

-- 
Jens Axboe
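
For readers following the thread, the inconsistency that the two added
WARN_ON_ONCE() checks look for can be modelled in user space roughly as
below. This is only a sketch, not kernel code: the struct layouts, the
RQF_ELV value, and the helper names alloc_request() and
flags_inconsistent() are hypothetical stand-ins, and it says nothing about
whether this is the actual root cause of the warning reported here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value only; not the kernel's definition. */
#define RQF_ELV (1u << 0)

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct queue {
	void *elevator;		/* NULL models the "none" scheduler */
};

struct request {
	unsigned int rq_flags;
};

/*
 * Snapshot the elevator state at allocation time, mirroring
 * `.rq_flags = q->elevator ? RQF_ELV : 0` quoted in the thread.
 */
static struct request alloc_request(const struct queue *q)
{
	struct request rq = { .rq_flags = q->elevator ? RQF_ELV : 0 };

	return rq;
}

/*
 * True when the request's RQF_ELV bit disagrees with the queue's current
 * elevator pointer, i.e. when either debug WARN_ON_ONCE() condition in
 * the patch would be true.
 */
static bool flags_inconsistent(const struct queue *q,
			       const struct request *rq)
{
	bool queue_has_elv = q->elevator != NULL;
	bool rq_has_elv = (rq->rq_flags & RQF_ELV) != 0;

	return queue_has_elv != rq_has_elv;
}
```

In this model, a request allocated while q->elevator is NULL stays
consistent until the pointer is set to a scheduler, after which
flags_inconsistent() returns true for the same request because its flags
were computed from the old state.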



