Sorry for late. On 03/14/2017 06:06 AM, Alex Williamson wrote: > On Mon, 27 Feb 2017 15:28:43 +0800 > Cao jin <caoj.fnst@xxxxxxxxxxxxxx> wrote: > >> 0. What happens now (PCIE AER only) >> Fatal errors cause a link reset. >> Non fatal errors don't. >> All errors stop the VM eventually, but not immediately >> because it's detected and reported asynchronously. >> Interrupts are forwarded as usual. >> Correctable errors are not reported to guest at all. >> Note: PPC EEH is different. This focuses on AER. > > Perhaps you're only focusing on AER, but don't the error handlers we're > using support both AER and EEH generically? I don't think we can > completely disregard how this affects EEH behavior, if at all. > After taking a rough look at the EEH, find that EEH always feed error_detected with pci_channel_io_frozen, from perspective of error_detected, EEH is not affected. I am not sure about a question: when assign devices in spapr host, should all functions/devices in a PE be bound to vfio? I am kind of confused about the relationship between a PE & a tce iommu group >> >> 1. Correctable errors >> There is no need to report these to guest. So let's not. > > What does this patch change to make this happen? I don't see > anything. Was this always the case? No change? > yes, no change on correctable error. >> >> 2. Fatal errors >> It's not easy to handle them gracefully since link reset >> is needed. As a first step, let's use the existing mechanism >> in that case. > > Ok, so no change here either. > >> 2. Non-fatal errors >> Here we could make progress by reporting them to guest >> and have guest handle them. > > In practice, what actual errors do we expect userspace to see as > non-fatal errors? It would be useful for the commit log to describe > the actual benefit we're going to see by splitting out non-fatal errors > for the user (not always a guest) to see separately. Justify that this > is actually useful. > >> >> Issues: >> a. this behaviour should only be enabled with new userspace, >> old userspace should work without changes. >> >> Suggestion: One way to address this would be to add a new eventfd >> non_fatal_err_trigger. If not set, invoke err_trigger. > > This outline format was really more useful for Michael to try to > generate discussion, for a commit log, I'd much rather see a definitive > statement such as: > > "To maintain backwards compatibility with userspace, non-fatal errors > will continue to trigger via the existing error interrupt index if a > non-fatal signaling mechanism has not been registered." > >> b. drivers are supposed to stop MMIO when error is reported, >> if vm keeps going, we will keep doing MMIO/config. >> >> Suggestion 1: ignore this. vm stop happens much later when >> userspace runs anyway, so we are not making things much worse. >> >> Suggestion 2: try to stop MMIO/config, resume on resume call >> >> Patch below implements Suggestion 1. >> >> Note that although this is really against the documentation, which >> states error_detected() is the point at which the driver should quiesce >> the device and not touch it further (until diagnostic poking at >> mmio_enabled or full access at resume callback). >> >> Fixing this won't be easy. However, this is not a regression. >> >> Also note this does nothing about interrupts, documentation >> suggests returning IRQ_NONE until reset. >> Again, not a regression. > > So again, no change here. I'm not sure what this adds to the commit > log, perhaps we can reference this as a link to Michael's original > proposal. > >> c. PF driver might detect that function is completely broken, >> if vm keeps going, we will keep doing MMIO/config. >> >> Suggestion 1: ignore this. vm stop happens much later when >> userspace runs anyway, so we are not making things much worse. >> >> Suggestion 2: detect this and invoke err_trigger to stop VM. >> >> Patch below implements Suggestion 2. > > This needs more description and seems a bit misleading. This patch > adds a slot_reset handler, such that if the slot is reset, we notify > the user, essentially promoting the non-fatal error to fatal. But what > condition gets us to this point? AIUI, AER is a voting scheme and if > any driver affected says they need a reset, everyone gets a reset. So > the PF driver we're talking about here is not vfio-pci and it's not the > user, the user has no way to signal that the device is completely > broken, this only handles the case of other collateral devices with > native host drivers that might signal this, right? > Yes, same understanding as you, if I don't miss something from michael. > It seems like this is where this patch has the greatest exposure to > regressions. If we take the VM use case, previously we could have a > non-AER aware guest and the hypervisor could stop the VM on all > errors. Now the hypervisor might support the distinction between fatal > and non-fatal, but the guest may still not have AER support. That > doesn't imply a problem with this approach, the user (hypervisor) would > be at fault for any difference in handling in that case. > >> >> +static pci_ers_result_t vfio_pci_aer_slot_reset(struct pci_dev *pdev) >> +{ >> + struct vfio_pci_device *vdev; >> + struct vfio_device *device; >> + static pci_ers_result_t err = PCI_ERS_RESULT_NONE; >> + >> + device = vfio_device_get_from_dev(&pdev->dev); >> + if (!device) >> + goto err_dev; >> + >> + vdev = vfio_device_data(device); >> + if (!vdev) >> + goto err_data; >> + >> + mutex_lock(&vdev->igate); >> + >> + if (vdev->err_trigger) >> + eventfd_signal(vdev->err_trigger, 1); > > What about the case where the user has not registered for receiving > non-fatal errors, now we send an error signal on both error_detected > and slot_reset. Is that useful/desirable? > Not desirable, but seems not harmful, guest user will stop anyway. How to avoid this case gracefully seems not easy. -- Sincerely, Cao jin