On Thu, Mar 23, 2023 at 05:07:24PM +0200, Sagi Grimberg wrote:
>
> > > > > > > No rdma device exposes its irq vectors affinity today. So the only
> > > > > > > mapping that we have left, is the default blk_mq_map_queues, which
> > > > > > > we fallback to anyways. Also fixup the only consumer of this helper
> > > > > > > (nvme-rdma).
> > > > > >
> > > > > > This was the only caller of ib_get_vector_affinity() so please delete
> > > > > > op get_vector_affinity and ib_get_vector_affinity() from verbs as well
> > > > >
> > > > > Yep, no problem.
> > > > >
> > > > > Given that nvme-rdma was the only consumer, do you prefer this goes from
> > > > > the nvme tree?
> > > >
> > > > Sure, it is probably fine
> > >
> > > I tried to do it two+ years ago:
> > > https://lore.kernel.org/all/20200929091358.421086-1-leon@xxxxxxxxxx
> >
> > Christoph's points make sense, but I think we should still purge this
> > code.
> >
> > If we want to do proper managed affinity the right RDMA API is to
> > directly ask for the desired CPU binding when creating the CQ, and
> > optionally a way to change the CPU binding of the CQ at runtime.
>
> I think the affinity management is referring to IRQD_AFFINITY_MANAGED
> which IIRC is the case when the device passes `struct irq_affinity` to
> pci_alloc_irq_vectors_affinity.
>
> Not sure what that has to do with passing a cpu to create_cq.

I took Christoph's remarks to be that the system should auto-configure
interrupts sensibly and not rely on userspace messing around in proc.

For instance, I would expect the NVMe driver to work the same way on
RDMA and PCI. For PCI it calls pci_alloc_irq_vectors_affinity(); RDMA
should call some ib_alloc_cq_affinity() and generate the affinity in
exactly the same way.

So, I have no problem with deleting these things, as the
get_vector_affinity API is not part of solving the affinity problem,
and it seems NVMe PCI doesn't need blk_mq_rdma_map_queues() either.

Jason
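
For reference, the IRQD_AFFINITY_MANAGED pattern mentioned above looks
roughly like the sketch below, modeled on what nvme-pci does when it
allocates its queue interrupts. The function name and vector counts here
are illustrative, not the actual nvme code:

/*
 * Illustrative sketch: passing a struct irq_affinity is what makes the
 * core mark the vectors IRQD_AFFINITY_MANAGED and spread them across
 * CPUs itself, with no userspace /proc tuning involved.
 */
#include <linux/interrupt.h>
#include <linux/pci.h>

static int example_setup_irqs(struct pci_dev *pdev, unsigned int nr_io_queues)
{
	struct irq_affinity affd = {
		.pre_vectors = 1,	/* e.g. keep one vector for an admin queue */
	};

	/* Returns the number of vectors allocated, or a negative errno. */
	return pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
					      PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY,
					      &affd);
}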
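
And a minimal sketch of the ib_alloc_cq_affinity() idea floated above.
This is hypothetical, no such verb exists in the tree; the signature
simply mirrors ib_alloc_cq() with the caller-chosen comp_vector replaced
by an irq_affinity description, so the core could bind CQ completion
vectors the same way the PCI path binds managed IRQs:

/*
 * Hypothetical API, for illustration only. A consumer such as
 * nvme-rdma would get the same managed CPU spreading as nvme-pci
 * without ever querying vector affinity itself.
 */
struct ib_cq *ib_alloc_cq_affinity(struct ib_device *dev, void *private,
				   int nr_cqe, enum ib_poll_context poll_ctx,
				   const struct irq_affinity *affd);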