On 11/3/21 7:59 AM, Yi Zhang wrote:
> On Wed, Nov 3, 2021 at 7:59 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 11/2/21 9:54 PM, Jens Axboe wrote:
>>> On Nov 2, 2021, at 9:52 PM, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Nov 02, 2021 at 09:21:10PM -0600, Jens Axboe wrote:
>>>>>> On 11/2/21 8:21 PM, Yi Zhang wrote:
>>>>>>>>
>>>>>>>> Can either one of you try with this patch? Won't fix anything, but it'll
>>>>>>>> hopefully shine a bit of light on the issue.
>>>>>>>>
>>>>>> Hi Jens
>>>>>>
>>>>>> Here is the full log:
>>>>>
>>>>> Thanks! I think I see what it could be - can you try this one as well,
>>>>> would like to confirm that the condition I think is triggering is what
>>>>> is triggering.
>>>>>
>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>> index 07eb1412760b..81dede885231 100644
>>>>> --- a/block/blk-mq.c
>>>>> +++ b/block/blk-mq.c
>>>>> @@ -2515,6 +2515,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>> if (plug && plug->cached_rq) {
>>>>> rq = rq_list_pop(&plug->cached_rq);
>>>>> INIT_LIST_HEAD(&rq->queuelist);
>>>>> + WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>> + WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>> } else {
>>>>> struct blk_mq_alloc_data data = {
>>>>> .q = q,
>>>>> @@ -2535,6 +2537,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>> bio_wouldblock_error(bio);
>>>>> goto queue_exit;
>>>>> }
>>>>> + WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>> + WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>
>>>> Hello Jens,
>>>>
>>>> I guess the issue could be the following code run without grabbing
>>>> ->q_usage_counter from blk_mq_alloc_request() and blk_mq_alloc_request_hctx().
>>>>
>>>> .rq_flags = q->elevator ? RQF_ELV : 0,
>>>>
>>>> then elevator is switched to real one from none, and check on q->elevator
>>>> becomes not consistent.
>>>
>>> Indeed, that’s where I was going with this. I have a patch, testing it
>>> locally but it’s getting late. Will send it out tomorrow. The nice
>>> benefit is that it allows dropping the weird ref get on plug flush,
>>> and batches getting the refs as well.
>>
>> Yi/Steffen, can you try pulling this into your test kernel:
>>
>> git://git.kernel.dk/linux-block for-next
>>
>> and see if it fixes the issue for you. Thanks!
>
> It still can be reproduced with the latest linux-block/for-next, here
> is the log
>
> fab2914e46eb (HEAD, new/for-next) Merge branch 'for-5.16/drivers' into
> for-next

Funky! Thanks for re-testing, I guess I need to think even harder about
this. Can't seem to reproduce it here at all, which makes it a bit harder
to poke at.

-- 
Jens Axboe
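
The inconsistency Ming Lei describes in the quoted text can be illustrated
with a small userspace sketch. This is not kernel code: the struct
definitions, the RQF_ELV bit value, and the helper names alloc_request() and
flags_mismatch() are hypothetical stand-ins invented for illustration. It
only shows the shape of the problem, a flag snapshotted from q->elevator at
allocation time going stale once the elevator is switched, and it says
nothing about what the eventual fix looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
#define RQF_ELV (1u << 0)   /* illustrative bit value, not the kernel's */

struct elevator { const char *name; };
struct queue    { struct elevator *elevator; };   /* NULL means "none" */
struct request  { unsigned int rq_flags; };

/*
 * Mirrors the initializer quoted in the thread:
 *     .rq_flags = q->elevator ? RQF_ELV : 0,
 * The flag is a snapshot of q->elevator at allocation time. Nothing in
 * this sketch pins the queue, which models allocating without holding
 * ->q_usage_counter.
 */
static struct request alloc_request(const struct queue *q)
{
    assert(q != NULL);
    struct request rq = { .rq_flags = q->elevator ? RQF_ELV : 0 };
    return rq;
}

/*
 * Same two conditions as the WARN_ON_ONCE() lines in the debug patch:
 * true when the request's flag disagrees with the queue's current state.
 */
static bool flags_mismatch(const struct queue *q, const struct request *rq)
{
    bool has_elevator = q->elevator != NULL;
    bool flagged_elv  = (rq->rq_flags & RQF_ELV) != 0;
    return has_elevator != flagged_elv;
}
```

Allocating a request while q.elevator is NULL and then pointing q.elevator at
a scheduler makes flags_mismatch() return true for that request, which is the
condition the two WARN_ON_ONCE() checks are meant to catch.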