On Tue, 3 Sep 2024 15:07:45 +0800 Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx> wrote: > On Tue, Sep 3, 2024 at 12:29 PM Manivannan Sadhasivam > <manivannan.sadhasivam@xxxxxxxxxx> wrote: > > > > On Tue, Sep 03, 2024 at 10:55:44AM +0800, Kai-Heng Feng wrote: > > > Meteor Lake VMD has a bug that the IRQ raises before the DMA > > > region is ready, so the requested IO is considered never > > > completed: [ 97.343423] nvme nvme0: I/O 259 QID 2 timeout, > > > completion polled [ 97.343446] nvme nvme0: I/O 384 QID 3 > > > timeout, completion polled [ 97.343459] nvme nvme0: I/O 320 QID > > > 4 timeout, completion polled [ 97.343470] nvme nvme0: I/O 707 > > > QID 5 timeout, completion polled > > > > > > The is documented as erratum MTL016 [0]. The suggested workaround > > > is to "The VMD MSI interrupt-handler should initially perform a > > > dummy register read to the MSI initiator device prior to any > > > writes to ensure proper PCIe ordering." which essentially is > > > adding a delay before the interrupt handling. > > > > > > > Why can't you add a dummy register read instead? Adding a delay for > > PCIe ordering is not going to work always. > > This can be done too. But it can take longer than 4us delay, so I'd > like to keep it a bit faster here. Since the issue is due to the bug in silicon and we have limited options. If community is okay to take some performance hit, then please add more details about the patch above VMD_FEAT_INTERRUPT_QUIRK. -nirmal > > > > > > Hence add a delay before handle interrupt to workaround the > > > erratum. > > > > > > [0] > > > https://edc.intel.com/content/www/us/en/design/products/platforms/details/meteor-lake-u-p/core-ultra-processor-specification-update/errata-details/#MTL016 > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217871 > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx> > > > --- > > > drivers/pci/controller/vmd.c | 18 ++++++++++++++++-- > > > 1 file changed, 16 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/controller/vmd.c > > > b/drivers/pci/controller/vmd.c index a726de0af011..3433b3730f9c > > > 100644 --- a/drivers/pci/controller/vmd.c > > > +++ b/drivers/pci/controller/vmd.c > > > @@ -16,6 +16,7 @@ > > > #include <linux/srcu.h> > > > #include <linux/rculist.h> > > > #include <linux/rcupdate.h> > > > +#include <linux/delay.h> > > > > > > #include <asm/irqdomain.h> > > > > > > @@ -74,6 +75,9 @@ enum vmd_features { > > > * proper power management of the SoC. > > > */ > > > VMD_FEAT_BIOS_PM_QUIRK = (1 << 5), > > > + > > > + /* Erratum MTL016 */ > > > + VMD_FEAT_INTERRUPT_QUIRK = (1 << 6), > > > }; > > > > > > #define VMD_BIOS_PM_QUIRK_LTR 0x1003 /* 3145728 ns */ > > > @@ -90,6 +94,8 @@ static DEFINE_IDA(vmd_instance_ida); > > > */ > > > static DEFINE_RAW_SPINLOCK(list_lock); > > > > > > +static bool interrupt_delay; > > > + > > > /** > > > * struct vmd_irq - private data to map driver IRQ to the VMD > > > shared vector > > > * @node: list item for parent traversal. > > > @@ -105,6 +111,7 @@ struct vmd_irq { > > > struct vmd_irq_list *irq; > > > bool enabled; > > > unsigned int virq; > > > + bool delay_irq; > > > > This is unused. Perhaps you wanted to use this instead of > > interrupt_delay? > > This is leftover, will scratch this. > > Kai-Heng > > > > > - Mani > > > > > }; > > > > > > /** > > > @@ -680,8 +687,11 @@ static irqreturn_t vmd_irq(int irq, void > > > *data) int idx; > > > > > > idx = srcu_read_lock(&irqs->srcu); > > > - list_for_each_entry_rcu(vmdirq, &irqs->irq_list, node) > > > + list_for_each_entry_rcu(vmdirq, &irqs->irq_list, node) { > > > + if (interrupt_delay) > > > + udelay(4); > > > generic_handle_irq(vmdirq->virq); > > > + } > > > srcu_read_unlock(&irqs->srcu, idx); > > > > > > return IRQ_HANDLED; > > > @@ -1015,6 +1025,9 @@ static int vmd_probe(struct pci_dev *dev, > > > const struct pci_device_id *id) if (features & > > > VMD_FEAT_OFFSET_FIRST_VECTOR) vmd->first_vec = 1; > > > > > > + if (features & VMD_FEAT_INTERRUPT_QUIRK) > > > + interrupt_delay = true; > > > + > > > spin_lock_init(&vmd->cfg_lock); > > > pci_set_drvdata(dev, vmd); > > > err = vmd_enable_domain(vmd, features); > > > @@ -1106,7 +1119,8 @@ static const struct pci_device_id vmd_ids[] > > > = { {PCI_VDEVICE(INTEL, 0xa77f), > > > .driver_data = VMD_FEATS_CLIENT,}, > > > {PCI_VDEVICE(INTEL, 0x7d0b), > > > - .driver_data = VMD_FEATS_CLIENT,}, > > > + .driver_data = VMD_FEATS_CLIENT | > > > + VMD_FEAT_INTERRUPT_QUIRK,}, > > > {PCI_VDEVICE(INTEL, 0xad0b), > > > .driver_data = VMD_FEATS_CLIENT,}, > > > {PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B), > > > -- > > > 2.43.0 > > > > > > > -- > > மணிவண்ணன் சதாசிவம்