On Fri, Jul 16, 2021 at 03:01:54PM -0500, Bjorn Helgaas wrote:
> On Thu, Jul 15, 2021 at 08:08:42PM +0800, Ming Lei wrote:
> > irq vector allocation with managed affinity may be used by driver, and
> > blk-mq needs this info because managed irq will be shutdown when all
> > CPUs in the affinity mask are offline.
> > 
> > The info of using managed irq is often produced by drivers(pci subsystem,
> 
> Add space between "drivers" and "(".
> s/pci/PCI/

OK.

> 
> Does this "managed IRQ" (or "managed affinity", not sure what the
> correct terminology is here) have something to do with devm?
> 
> > platform device, ...), and it is consumed by blk-mq, so different subsystems
> > are involved in this info flow
> 
> Add period at end of sentence.

OK.

> 
> > Address this issue by adding one field of .irq_affinity_managed into
> > 'struct device'.
> > 
> > Suggested-by: Christoph Hellwig <hch@xxxxxx>
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> >  drivers/base/platform.c | 7 +++++++
> >  drivers/pci/msi.c       | 3 +++
> >  include/linux/device.h  | 1 +
> >  3 files changed, 11 insertions(+)
> > 
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 8640578f45e9..d28cb91d5cf9 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -388,6 +388,13 @@ int devm_platform_get_irqs_affinity(struct platform_device *dev,
> >  					ptr->irq[i], ret);
> >  			goto err_free_desc;
> >  		}
> > +
> > +		/*
> > +		 * mark the device as irq affinity managed if any irq affinity
> > +		 * descriptor is managed
> > +		 */
> > +		if (desc[i].is_managed)
> > +			dev->dev.irq_affinity_managed = true;
> >  	}
> >  
> >  	devres_add(&dev->dev, ptr);
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > index 3d6db20d1b2b..7ddec90b711d 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -1197,6 +1197,7 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
> >  	if (flags & PCI_IRQ_AFFINITY) {
> >  		if (!affd)
> >  			affd = &msi_default_affd;
> > +		dev->dev.irq_affinity_managed = true;
> 
> This is really opaque to me.  I can't tell what the connection between
> PCI_IRQ_AFFINITY and irq_affinity_managed is.

The comment for PCI_IRQ_AFFINITY is 'Auto-assign affinity', and
'irq_affinity_managed' basically means that the irq's affinity is managed
by the kernel.

What blk-mq needs to know is exactly whether PCI_IRQ_AFFINITY was applied
when allocating the irq vectors.

When PCI_IRQ_AFFINITY is used, genirq will shut down the irq when all CPUs
in the assigned affinity are offline, so blk-mq has to drain all in-flight
IOs which will be completed via this irq and prevent new IO (see the rough
sketch at the end of this mail). That is the connection.

Or do you think 'irq_affinity_managed' isn't named well?

> 
> AFAICT the only place irq_affinity_managed is ultimately used is
> blk_mq_hctx_notify_offline(), and there's no obvious connection
> between that and this code.

I believe the connection is described in the comment above.


Thanks,
Ming
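
P.S. The blk-mq consumer isn't part of this patch, so below is only a rough,
untested sketch of how the new flag could be consumed from the CPU hotplug
callback mentioned above; it is not the actual blk-mq change.
blk_mq_hctx_get_dev() is a made-up helper standing in for however blk-mq
would look up the 'struct device' that allocated the irq vectors:

static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
{
	struct blk_mq_hw_ctx *hctx = hlist_entry_safe(node,
			struct blk_mq_hw_ctx, cpuhp_online);
	/* made-up helper: resolve the device that owns this queue's vectors */
	struct device *dev = blk_mq_hctx_get_dev(hctx);

	/*
	 * Without managed irq affinity, genirq never shuts the vector down
	 * when its CPUs go offline, so there is nothing to drain here.
	 */
	if (!dev || !dev->irq_affinity_managed)
		return 0;

	/*
	 * Managed affinity: the irq will be shut down once the last CPU in
	 * its mask goes offline, so stop queueing new requests on this hctx
	 * and wait for in-flight IO to complete before that happens.
	 * (existing drain logic goes here)
	 */

	return 0;
}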