On Tue, Jun 29, 2021 at 02:39:14PM +0200, Hannes Reinecke wrote:
> On 6/29/21 9:49 AM, Ming Lei wrote:
> > hctx is deactivated when all CPU in hctx->cpumask become offline by
> > draining all requests originated from this hctx and moving new
> > allocation to active hctx. This way is for avoiding inflight IO when
> > the managed irq is shutdown.
> >
> > Some drivers(nvme fc, rdma, tcp, loop) doesn't use managed irq, so
> > they needn't to deactivate hctx. Also, they are the only user of
> > blk_mq_alloc_request_hctx() which is used for connecting io queue.
> > And their requirement is that the connect request can be submitted
> > via one specified hctx on which all CPU in its hctx->cpumask may have
> > become offline.
> >
>
> How can you submit a connect request for a hctx on which all CPUs are
> offline? That hctx will be unusable as it'll never be able to receive
> interrupts ...

I believe BLK_MQ_F_NOT_USE_MANAGED_IRQ is self-explanatory. The
(non-managed) interrupt of this hctx will be migrated to online CPUs,
see migrate_one_irq().

For managed irq, we have to prevent new allocation if all CPUs of this
hctx are offline, because genirq will shut down the interrupt.

>
> > Address the requirement for nvme fc/rdma/loop, so the reported kernel
> > panic on the following line in blk_mq_alloc_request_hctx() can be fixed.
> >
> > 	data.ctx = __blk_mq_get_ctx(q, cpu)
> >
> > Cc: Sagi Grimberg <sagi@xxxxxxxxxxx>
> > Cc: Daniel Wagner <dwagner@xxxxxxx>
> > Cc: Wen Xiong <wenxiong@xxxxxxxxxx>
> > Cc: John Garry <john.garry@xxxxxxxxxx>
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> >  block/blk-mq.c         | 6 +++++-
> >  include/linux/blk-mq.h | 1 +
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index df5dc3b756f5..74632f50d969 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -494,7 +494,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
> >  	data.hctx = q->queue_hw_ctx[hctx_idx];
> >  	if (!blk_mq_hw_queue_mapped(data.hctx))
> >  		goto out_queue_exit;
> > -	cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
> > +	cpu = cpumask_first(data.hctx->cpumask);
> >  	data.ctx = __blk_mq_get_ctx(q, cpu);
>
> I don't get it.
> Doesn't this allow us to allocate a request on a dead cpu, ie the very thing
> we try to prevent?

It is fine to allocate & dispatch one request to the hctx when all CPUs in
its cpumask are offline, if this hctx's interrupt isn't managed.

Thanks,
Ming
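
P.S. To make the difference between the two cpumask calls in the hunk
concrete, here is a rough userspace sketch, not kernel code. It assumes
a cpumask can be modelled as a 32-bit word and uses a made-up limit
SKETCH_NR_CPUS in place of nr_cpu_ids. The sketch_* helpers are
hypothetical names; they only mimic the documented contract of
cpumask_first() and cpumask_first_and(), which return a value
>= nr_cpu_ids when no bit is set.

```c
#include <stdint.h>

/* Hypothetical stand-in for nr_cpu_ids; the real value is set at boot. */
#define SKETCH_NR_CPUS 8u

/*
 * Return the index of the lowest set bit in mask, or SKETCH_NR_CPUS if
 * the mask is empty. This mimics the contract of cpumask_first().
 */
static inline unsigned int sketch_first(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/*
 * Return the lowest bit set in both masks, or SKETCH_NR_CPUS if the
 * intersection is empty. This mimics cpumask_first_and().
 */
static inline unsigned int sketch_first_and(uint32_t a, uint32_t b)
{
	return sketch_first(a & b);
}

/* Non-zero if cpu is an index that a per-CPU lookup could safely use. */
static inline int sketch_cpu_valid(unsigned int cpu)
{
	return cpu < SKETCH_NR_CPUS;
}
```

With an hctx mask of 0x0C (CPUs 2 and 3) and an online mask of 0x03
(CPUs 0 and 1), sketch_first_and() returns SKETCH_NR_CPUS, i.e. no
valid CPU, while sketch_first() on the hctx mask alone returns 2. An
out-of-range CPU index handed to a per-CPU lookup is presumably what
leads to the reported panic; picking from the hctx mask alone always
gives a valid index as long as the hctx is mapped.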