On Thu, Sep 16, 2021 at 10:17:18AM +0800, Ming Lei wrote:
> Firstly, even with patches of 'qla2xxx - add nvme map_queues support',
> the knowledge if managed irq is used in nvmef LLD is still missed, so
> blk_mq_hctx_use_managed_irq() may always return false, but that
> shouldn't be hard to solve.

Yes, that's pretty simple:

--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -7914,6 +7914,9 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost)
 		rc = blk_mq_map_queues(qmap);
 	else
 		rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, vha->irq_offset);
+
+	qmap->use_managed_irq = true;
+
 	return rc;
 }

> The problem is that we still should make connect io queue completed
> when all CPUs of this hctx is offline in case of managed irq.

I agree, though if I understand this right, the scenario where all CPUs
of a hctx are offline and we still want to use that hctx only happens
after an initial setup followed by a reconnect attempt. That is, during
the first connect attempt only online CPUs are assigned to the hctx.
When those CPUs are taken offline, the block layer makes sure not to use
those queues anymore (no problem for the hctx so far). Then, for some
reason, the nvme-fc layer decides to reconnect and we end up in the
situation where we don't have any online CPU in the given hctx.

> One solution might be to use io polling for connecting io queue, but nvme fc
> doesn't support polling, all the other nvme hosts do support it.

No idea, something to explore for sure :)

My point is that your series fixes existing bugs and doesn't introduce
new ones. qla2xxx already depends on managed IRQs. I would like to see
your series accepted, with my hack as a stop-gap solution until we have
a proper fix.
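
For what it's worth, a minimal sketch (not part of your series; the
helper name and its placement are made up) of how a connect path could
detect that a hctx has no online CPU left before issuing the connect
I/O:

#include <linux/blk-mq.h>
#include <linux/cpumask.h>

/*
 * Purely illustrative helper: returns true if at least one CPU mapped
 * to this hctx is currently online. A reconnect attempt could skip or
 * defer queues for which this returns false instead of issuing a
 * connect command that nobody can complete.
 */
static bool nvme_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx)
{
	return cpumask_first_and(hctx->cpumask, cpu_online_mask) < nr_cpu_ids;
}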