Re: [PATCH 4/4] vfio/pci: Restore MSIx message prior to enabling

On Mon, Jun 02, 2014 at 10:57:05AM -0600, Bjorn Helgaas wrote:
>On Sat, May 31, 2014 at 5:42 AM, Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx> wrote:
>> On Fri, May 30, 2014 at 04:12:32PM -0600, Bjorn Helgaas wrote:
>>>On Mon, May 19, 2014 at 01:01:10PM +1000, Gavin Shan wrote:

.../...

[ Remove the confusing description ]

>It sounds like QEMU assumes the MSIx entries can't be changed by
>anything other than the writes it traps.  This assumption is false
>(the entries are cleared when the driver resets the device, and QEMU
>doesn't know about the reset).
>

If I understand correctly, QEMU doesn't allow the guest direct access to
the MSIx table in HW. Accesses are trapped by QEMU and, in most cases,
terminated there, so the guest's MSIx message never gets written to HW.
 
>Why can't QEMU trap the write from pci_restore_state() and update the
>hardware, even if it thinks nothing has changed?
>

For MSIx messages, pci_restore_state() restores what the device got
from QEMU. I think that MSIx message isn't the one HW expects (more
details below).

Sorry, Bjorn. I think my last reply must have confused you, as it wasn't
correct. The problem and the tentative fix have been around for some time
and I had almost forgotten the details. I rechecked the earlier discussion
on the topic, and it's not what I described in my last reply:

http://comments.gmane.org/gmane.comp.emulators.kvm.devel/119689

Let me correct it as follows. Alex.W in the cc list is the VFIO expert.
I might have something wrong about VFIO, and Alex could help correct me :-)

1) Guest: PCI device works fine in guest
2) QEMU:  Caches the MSIx entry (unmasked). It seems the MSIx message maintained
by QEMU is worked out by QEMU itself and is inconsistent with the one in HW (set
up by the host kernel). That's a separate (potential) issue. So QEMU and the host
don't exchange MSIx messages with each other.
3) Guest: PCI device driver calls pci_save_state(), issues a reset, then calls
pci_restore_state().
4) QEMU:  Traps the access and notifies the VFIO PCI device to start the MSIx
interrupt, which is done by ioctl() on the VFIO PCI device on the host side. It
seems that the VFIO device driver does request_irq() and sets up the irqfd so
that the interrupt can be propagated to QEMU.

The problem is that the MSIx message gets lost because of the reset.
Unfortunately, nobody restores the message to hardware. Eventually, the
PCI device sends DMA traffic (for the MSIx interrupt) with 0x0
address/data, which isn't allowed on the Power platform and causes an
EEH error.
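
To make the sequence above concrete, here is a small userspace model of
one MSIx table entry. It is only a sketch, not kernel or QEMU code: all
toy_* names are made up for illustration, and the "entry is all zeroes
after reset" behaviour simply follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one MSIx table entry: address, data, vector control. */
struct toy_msix_entry {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t data;
	uint32_t vector_ctrl;	/* bit 0: per-vector mask */
};

#define TOY_MASKBIT 0x1u

/* Hypothetical device state: the HW entry plus the host's own copy. */
struct toy_device {
	struct toy_msix_entry hw;	 /* what the device currently holds */
	struct toy_msix_entry host_copy; /* message composed by the host */
};

/* Host sets up the vector: compose the message and program the device. */
static void toy_host_setup(struct toy_device *dev, uint64_t addr, uint32_t data)
{
	dev->host_copy.addr_lo = (uint32_t)addr;
	dev->host_copy.addr_hi = (uint32_t)(addr >> 32);
	dev->host_copy.data = data;
	dev->host_copy.vector_ctrl = 0;
	dev->hw = dev->host_copy;
}

/* Assumption taken from the thread: reset wipes the HW entry to zero. */
static void toy_reset(struct toy_device *dev)
{
	memset(&dev->hw, 0, sizeof(dev->hw));
}

/* Re-enable the vector without anybody rewriting the message. */
static void toy_enable_without_restore(struct toy_device *dev)
{
	dev->hw.vector_ctrl &= ~TOY_MASKBIT;
}

/* Re-enable the vector after writing the host's copy back first. */
static void toy_enable_with_restore(struct toy_device *dev)
{
	dev->hw = dev->host_copy;
	dev->hw.vector_ctrl &= ~TOY_MASKBIT;
}

/* True if an interrupt would go out with 0x0 address and 0x0 data. */
static bool toy_message_is_zero(const struct toy_device *dev)
{
	return dev->hw.addr_lo == 0 && dev->hw.addr_hi == 0 &&
	       dev->hw.data == 0;
}

/* Walk through both variants; the address/data values are arbitrary. */
static int toy_demo(void)
{
	struct toy_device dev;

	toy_host_setup(&dev, 0xfee00000ull, 0x4021u);
	toy_reset(&dev);
	toy_enable_without_restore(&dev);
	assert(toy_message_is_zero(&dev));	/* the failing case */

	toy_enable_with_restore(&dev);
	assert(!toy_message_is_zero(&dev));	/* message is back in HW */
	return 0;
}
```

Calling toy_demo() runs both paths: without a restore the entry stays
zeroed after the reset, and with a restore from the host's copy the
original message is back before the vector is enabled.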

Since the MSIx messages that QEMU and the host own are different, and
QEMU's is invalid, it doesn't make sense to update hardware with QEMU's
cached message. On the other hand, the message should be restored to HW
by somebody, and the scenario is specific to VFIO PCI. It sounds fair to
have the VFIO PCI driver restore the message, as we did. As you said,
it's ugly for a driver to write the MSIx message. I'm not sure.

From the guest's side, the PCI code is consistent and I don't think there
is anything we need to improve: pci_save_state(), reset, pci_restore_state()
should work fine.

From the host side, we could probably restore the MSIx message in
request_irq(). Doing it in the IRQ chip callbacks (e.g. startup, unmask)
would add the overhead of restoring the MSIx message on those paths, which
is totally unnecessary for the host itself.
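
As a rough illustration of that overhead, the sketch below compares a
plain unmask callback with one that rewrites the message first. It is a
userspace toy, not the kernel's irq_chip API: the toy_* names and the
write counting are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_msg {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t data;
};

struct toy_vector {
	struct toy_msg hw;	  /* entry as held by the device */
	uint32_t hw_masked;	  /* 1 if the vector is masked */
	struct toy_msg cached;	  /* host's cached message */
	unsigned int mmio_writes; /* simulated table writes */
};

/* Hypothetical callback table, loosely shaped like IRQ chip hooks. */
struct toy_irq_ops {
	void (*unmask)(struct toy_vector *v);
};

/* Plain unmask: one write to the vector control word. */
static void toy_unmask_plain(struct toy_vector *v)
{
	v->hw_masked = 0;
	v->mmio_writes += 1;
}

/* Restoring unmask: three extra writes (addr_lo, addr_hi, data). */
static void toy_unmask_restoring(struct toy_vector *v)
{
	v->hw = v->cached;
	v->mmio_writes += 3;
	toy_unmask_plain(v);
}

static const struct toy_irq_ops toy_plain_ops = {
	.unmask = toy_unmask_plain,
};

static const struct toy_irq_ops toy_restoring_ops = {
	.unmask = toy_unmask_restoring,
};

/* Simulate a reset, then unmask; returns the number of table writes. */
static unsigned int toy_reset_then_unmask(const struct toy_irq_ops *ops,
					  struct toy_vector *v)
{
	assert(ops != NULL && ops->unmask != NULL && v != NULL);

	v->hw = (struct toy_msg){ 0, 0, 0 };
	v->hw_masked = 1;
	v->mmio_writes = 0;
	ops->unmask(v);
	return v->mmio_writes;
}
```

In this model the restoring variant costs four writes per unmask instead
of one. That extra cost is paid on every unmask, even when no reset has
happened and the host itself would never need it.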

Hopefully, I've made myself clear this time :-)

Thanks,
Gavin 




