On Fri, May 12, 2023 at 09:25:18AM -0600, Jens Axboe wrote:
> On 5/12/23 9:19 AM, Ming Lei wrote:
> > On Fri, May 12, 2023 at 09:08:54AM -0600, Jens Axboe wrote:
> >> On 5/12/23 9:03 AM, Ming Lei wrote:
> >>> Passthrough (pt) requests shouldn't be queued to the scheduler; in
> >>> particular, some schedulers (such as bfq) assume that req->bio is always
> >>> available and that the blk-cgroup can be retrieved via the bio.
> >>>
> >>> A pt request can also be part of error handling, so it is better to
> >>> always queue it to hctx->dispatch directly.
> >>>
> >>> Fix this issue by queuing pt requests from the plug list to
> >>> hctx->dispatch directly.
> >>
> >> Why not just add the check to the BFQ insertion? That would be a lot
> >> more trivial and would not be polluting the core with this stuff.
> >
> > pt requests are supposed to be issued to the device directly, and we
> > never queued them to the scheduler before 1c2d2fff6dc0 ("block: wire-up
> > support for passthrough plugging").
> >
> > Some pt requests might be part of error handling, and adding them to the
> > scheduler could cause an IO hang.
>
> I'm not suggesting adding it to the scheduler, just having the bypass
> "add to dispatch" in a different spot.

Originally the request was added to dispatch in blk_execute_rq_nowait(), one
request at a time. Now that we support plugging for pt requests, that is why I
add the bypass in blk_mq_dispatch_plug_list() instead, so the lock is only
grabbed once per batch, given that blk_execute_rq_nowait() is now a fast path
for the nvme uring passthrough IO feature.

> Let me take a look at it... Do we have a reproducer for this issue?

Guang Wu and Yu Kuai should have one; I didn't succeed in reproducing it by
enabling bfq and setting the io.bfq.weight cgroup attribute in my test VM.

Thanks,
Ming
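
For illustration, here is a minimal sketch of the kind of bypass described
above: split passthrough requests off a plugged list and splice them onto
hctx->dispatch with one lock grab per batch. This is not the actual patch; the
helper name is hypothetical, and it assumes every request on the list maps to
the same hctx, as with the per-hctx batches handled by
blk_mq_dispatch_plug_list().

#include <linux/blk-mq.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/*
 * Hypothetical helper, shown only to illustrate the idea discussed in this
 * thread: move passthrough requests from a plugged request list straight to
 * hctx->dispatch, taking the dispatch lock once for the whole batch instead
 * of once per request.  Assumes all requests on @list belong to @hctx.
 */
static void blk_mq_plug_bypass_passthrough(struct blk_mq_hw_ctx *hctx,
					   struct list_head *list)
{
	LIST_HEAD(pt_list);
	struct request *rq, *next;

	/* Pull passthrough requests off the plugged list. */
	list_for_each_entry_safe(rq, next, list, queuelist) {
		if (blk_rq_is_passthrough(rq))
			list_move_tail(&rq->queuelist, &pt_list);
	}

	if (list_empty(&pt_list))
		return;

	/* One lock round-trip per batch, straight to hctx->dispatch. */
	spin_lock(&hctx->lock);
	list_splice_tail_init(&pt_list, &hctx->dispatch);
	spin_unlock(&hctx->lock);
}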