Re: [PATCH 3/3] drivers/vfio/pci: Fix MSIx message lost

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2014-03-03 at 14:10 +0800, Gavin Shan wrote:
> On Sun, Mar 02, 2014 at 09:49:49PM -0700, Alex Williamson wrote:
> >On Mon, 2014-03-03 at 14:51 +1100, Benjamin Herrenschmidt wrote:
> >> On Mon, 2014-03-03 at 11:24 +0800, Gavin Shan wrote:
> 
> .../...
> 
> >> 
> >> > Reported-by: Wen Xiong <wenxiong@xxxxxxxxxx>
> >> > Signed-off-by: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx>
> >> > ---
> >> >  drivers/vfio/pci/vfio_pci_intrs.c |   19 +++++++++++++++++++
> >> >  1 file changed, 19 insertions(+)
> >> > 
> >> > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> >> > index 2103576..279ebd0 100644
> >> > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> >> > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> >> > @@ -17,6 +17,7 @@
> >> >  #include <linux/interrupt.h>
> >> >  #include <linux/eventfd.h>
> >> >  #include <linux/pci.h>
> >> > +#include <linux/msi.h>
> >> >  #include <linux/file.h>
> >> >  #include <linux/poll.h>
> >> >  #include <linux/vfio.h>
> >> > @@ -517,6 +518,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> >> >  	struct pci_dev *pdev = vdev->pdev;
> >> >  	int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> >> >  	char *name = msix ? "vfio-msix" : "vfio-msi";
> >> > +	struct msi_msg msg;
> >> >  	struct eventfd_ctx *trigger;
> >> >  	int ret;
> >> >  
> >> > @@ -544,6 +546,23 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> >> >  		return PTR_ERR(trigger);
> >> >  	}
> >> >  
> >> > +	/* We possiblly lose the MSI/MSIx message in some cases.
> >> > +	 * For example, BIST reset on IPR adapter. The MSIx table
> >> > +	 * is cleaned out. However, we never get chance to put
> >> > +	 * MSIx messages to MSIx table because all MSIx stuff is
> >> > +	 * being cached in QEMU. Here, we had the trick to put the
> >> > +	 * MSI/MSIx message back.
> >> > +	 *
> >> > +	 * Basically, we needn't worry about MSI messages. However,
> >> > +	 * it's not harmful and there might be cases of PCI config data
> >> > +	 * lost because of cached PCI config data in QEMU again.
> >> > +	 *
> >> > +	 * Note that we should flash the message prior to enabling
> >> > +	 * the corresponding interrupt by request_irq().
> >> > +	 */
> >> > +	 get_cached_msi_msg(irq, &msg);
> >> > +	 write_msi_msg(irq, &msg);
> >> > +
> >> >  	ret = request_irq(irq, vfio_msihandler, 0,
> >> >  			  vdev->ctx[vector].name, trigger);
> >> >  	if (ret) {
> >
> >
> >I understand from talking to Alexey that the BARs are reset by this
> >BIST, but what about the MSIX capability?  If that gets reset to be
> >disabled, where does it get re-enabled?  QEMU won't know about the BIST,
> >so probably assumes nothing changed when the guest writes the enable
> >bit.  VFIO doesn't allow userspace writes to the MSIX capability.  So it
> >seems like the above would restore the table entry, but not necessarily
> >whether MSIX is enabled on the device.
> >
> 
> Yes, It's not necessarily, but not harmful to restore the message into
> device even though the PCI device has individual enabled. Prior to
> request_irq(), the interrupt should have been masked on PIC/CPU side or
> race condition there.
> 
> >When I talked with Alexey I noted that we already have a BAR restore
> >function, vfio_bar_restore(), that tries to do some fixup when backdoor
> >resets are noticed.  If we were to trigger that upon noting the user
> >writing BAR values to what we think they're already set to, we could
> >extend it to trigger an interrupt restore that could fixup both the
> >capability and the table entries.  Thanks,
> >
> 
> Various PCI devices could have different ways to do BIST. It's a bit
> hard to capture all of them, I think. Alex, please advise which way
> to fix the issue would be better. Or we can have this patch and figure
> out the way using vfio_bar_restore() later? :-)

This approach arguably has a very niche application.  I'd rather see
this device fall into the vfio_bar_restore() and add a callout to the
vfio irq code to restore the MSIX table and MSI/MSIX capabilities if the
surprise reset occurred with either enabled.  That seems like it would
be generally useful to any device with a backdoor reset.  Besides, if
the BARs on the device are being deprogrammed by BIST, then it must
already be hitting the vfio_bar_restore code or just re-writing the
vector table entry wouldn't be enough to get it running again.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux