On Fri, May 12, 2023 at 09:25:18AM -0600, Jens Axboe wrote:
> On 5/12/23 9:19 AM, Ming Lei wrote:
> > On Fri, May 12, 2023 at 09:08:54AM -0600, Jens Axboe wrote:
> >> On 5/12/23 9:03 AM, Ming Lei wrote:
> >>> Passthrough (pt) requests shouldn't be queued to the scheduler; in
> >>> particular, some schedulers (such as bfq) assume that req->bio is always
> >>> available and that the blk-cgroup can be retrieved via the bio.
> >>>
> >>> A pt request can also be part of error handling, so it is better to
> >>> always queue it to hctx->dispatch directly.
> >>>
> >>> Fix this issue by queuing pt requests from the plug list to
> >>> hctx->dispatch directly.
> >>
> >> Why not just add the check to the BFQ insertion? That would be a lot
> >> more trivial and would not be polluting the core with this stuff.
> >
> > pt requests are supposed to be issued to the device directly, and we
> > never queued them to the scheduler before 1c2d2fff6dc0 ("block: wire-up
> > support for passthrough plugging").
> >
> > Some pt requests might be part of error handling, and adding them to the
> > scheduler could cause an IO hang.
>
> I'm not suggesting adding it to the scheduler, just having the bypass
> "add to dispatch" in a different spot.

Originally the request was added to dispatch in blk_execute_rq_nowait(), one
request at a time. Now that we support plugging for pt requests, that is why I
add the bypass in blk_mq_dispatch_plug_list() instead, so the lock is only
grabbed once per batch, given that blk_execute_rq_nowait() is now a fast path
for the nvme uring passthrough IO feature.

> Let me take a look at it... Do we have a reproducer for this issue?

Guang Wu and Yu Kuai should have one; I didn't succeed in reproducing it by
enabling bfq and setting the io.bfq.weight cgroup attribute in my test VM.

Thanks,
Ming
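
For illustration, here is a minimal sketch of the kind of bypass described
above: split passthrough requests off a plugged list and splice them onto
hctx->dispatch with one lock grab per batch. This is not the actual patch; the
helper name is hypothetical, and it assumes every request on the list maps to
the same hctx, as with the per-hctx batches handled by
blk_mq_dispatch_plug_list().

#include <linux/blk-mq.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/*
 * Hypothetical helper, shown only to illustrate the idea discussed in this
 * thread: move passthrough requests from a plugged request list straight to
 * hctx->dispatch, taking the dispatch lock once for the whole batch instead
 * of once per request.  Assumes all requests on @list belong to @hctx.
 */
static void blk_mq_plug_bypass_passthrough(struct blk_mq_hw_ctx *hctx,
					   struct list_head *list)
{
	LIST_HEAD(pt_list);
	struct request *rq, *next;

	/* Pull passthrough requests off the plugged list. */
	list_for_each_entry_safe(rq, next, list, queuelist) {
		if (blk_rq_is_passthrough(rq))
			list_move_tail(&rq->queuelist, &pt_list);
	}

	if (list_empty(&pt_list))
		return;

	/* One lock round-trip per batch, straight to hctx->dispatch. */
	spin_lock(&hctx->lock);
	list_splice_tail_init(&pt_list, &hctx->dispatch);
	spin_unlock(&hctx->lock);
}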