On Tue, 2014-03-18 at 09:32 +0800, Gavin Shan wrote: > On Mon, Mar 17, 2014 at 04:16:22PM -0600, Alex Williamson wrote: > >On Mon, 2014-03-10 at 13:46 +0800, Gavin Shan wrote: > >> The problem is specific to the case of BIST reset issued to IPR > >> adapter on the guest side. The IPR driver calls pci_save_state(), > >> issues BIST reset and then pci_restore_state(). The issued BIST > >> cleans out MSIx table and pci_restore_state() doesn't restore > >> the MSIx table as expected. Eventually, MSIx messages with all > >> zeros are sent and causes EEH error on Power platform. > >> > >> The patch fixes it by writing MSIx message to MSIx table before > >> reenabling the individual interrupts in the following path: > >> > >> qemu/hw/pci/msix.c::msix_table_mmio_write > >> msix_handle_mask_update > >> msix_fire_vector_notifier > >> qemu/hw/misc/vfio.c::vfio_msix_vector_use > >> vfio_msix_vector_do_use > >> > >> IOCTL Command VFIO_DEVICE_SET_IRQS to VFIO-PCI > >> vfio/pci/vfio_pci.c::vfio_pci_ioctl > >> vfio/pci/vfio_pci_intrs.c::vfio_pci_set_irqs_ioctl > >> vfio_pci_set_msi_trigger > >> vfio_msi_set_block > >> vfio_msi_set_vector_signal > >> > >> Reported-by: Wen Xiong <wenxiong@xxxxxxxxxx> > >> Signed-off-by: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx> > >> --- > >> drivers/vfio/pci/vfio_pci_intrs.c | 11 +++++++++++ > >> 1 file changed, 11 insertions(+) > >> > >> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c > >> index 2103576..83e0638 100644 > >> --- a/drivers/vfio/pci/vfio_pci_intrs.c > >> +++ b/drivers/vfio/pci/vfio_pci_intrs.c > >> @@ -17,6 +17,7 @@ > >> #include <linux/interrupt.h> > >> #include <linux/eventfd.h> > >> #include <linux/pci.h> > >> +#include <linux/msi.h> > >> #include <linux/file.h> > >> #include <linux/poll.h> > >> #include <linux/vfio.h> > >> @@ -517,6 +518,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev, > >> struct pci_dev *pdev = vdev->pdev; > >> int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector; > >> char *name = msix ? "vfio-msix" : "vfio-msi"; > >> + struct msi_msg msg; > >> struct eventfd_ctx *trigger; > >> int ret; > >> > >> @@ -544,6 +546,15 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev, > >> return PTR_ERR(trigger); > >> } > >> > >> + /* We possiblly lose the MSI/MSIx message in some cases, one > >> + * of which is pci_save_state(), BIST reset and pci_restore_state() > >> + * for IPR adapter. The BIST reset cleans out MSIx table and we > >> + * don't have chance to restore it. Here, we have the trick to > >> + * restore it before enabling individual interrupts. For MSI messages, > >> + * it's harmless to write them back. > >> + */ > >> + get_cached_msi_msg(irq, &msg); > >> + write_msi_msg(irq, &msg); > >> ret = request_irq(irq, vfio_msihandler, 0, > >> vdev->ctx[vector].name, trigger); > >> if (ret) { > > > > > >Hi Gavin, > > > >The commit log and patch comment are still pretty confusing. The > >problem is not necessarily specific to the IPR card, it's just the only > >instance we know of it, otherwise we'd make a quirk that only gets > >invoked for this device. We're also not only making the change for the > >call path listed, but for any callers of the function. I also got a bit > >nervous that while it seems like it should be safe to re-write the > >cached MSI message before enabling it, our argument for doing so is that > >the MSIx vector table lives in device memory. This is not true for MSI > >and modifying MSI modifies all the vectors, so why are we doing it? The > >result is the patch below. Please let me know if you approve and I'll > >apply it after further testing. Thanks, > > > > Thanks, Alex. Your changes look good. We needn't touch MSI stuff and > it's not necessarily specific to IPR adapter :-) We still have a problem, get_cached_msi_msg() and write_msi_msg() are not exported for module use, so we get undefined symbols trying to use them. Please get them exported first or figure out a different strategy. Thanks, Alex > >Author: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx> > > > > vfio/pci: Restore MSIx message prior to enabling > > > > The MSIx vector table lives in device memory, which may be cleared as > > part of a backdoor device reset. This is the case on the IBM IPR HBA > > when the BIST is run on the device. When assigned to a QEMU guest, > > the guest driver does a pci_save_state(), issues a BIST, then does a > > pci_restore_state(). The BIST clears the MSIx vector table, but due > > to the way interrupts are configured the pci_restore_state() does not > > restore the vector table as expected. Eventually this results in an > > EEH error on Power platforms when the device attempts to signal an > > interrupt with the zero'd table entry. > > > > Fix the problem by restoring the host cached MSI message prior to > > enabling each vector. > > > > Reported-by: Wen Xiong <wenxiong@xxxxxxxxxx> > > Signed-off-by: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx> > > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx> > > > >diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c > >index 2103576..3058b2e 100644 > >--- a/drivers/vfio/pci/vfio_pci_intrs.c > >+++ b/drivers/vfio/pci/vfio_pci_intrs.c > >@@ -17,6 +17,7 @@ > > #include <linux/interrupt.h> > > #include <linux/eventfd.h> > > #include <linux/pci.h> > >+#include <linux/msi.h> > > #include <linux/file.h> > > #include <linux/poll.h> > > #include <linux/vfio.h> > >@@ -544,6 +545,19 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev, > > return PTR_ERR(trigger); > > } > > > >+ /* > >+ * The MSIx vector table resides in device memory which may be cleared > >+ * via backdoor resets. We don't allow direct access to the vector > >+ * table so even if a userspace driver attempts to save/restore around > >+ * such a reset it would be unsuccessful. To avoid this, restore the > >+ * cached value of the message prior to enabling. > >+ */ > >+ if (msix) { > >+ struct msi_msg msg; > >+ get_cached_msi_msg(irq, &msg); > >+ write_msi_msg(irq, &msg); > >+ } > >+ > > ret = request_irq(irq, vfio_msihandler, 0, > > vdev->ctx[vector].name, trigger); > > if (ret) { > > > > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html