On Fri, 8 Jun 2018 01:19:07 +1000
Alexey Kardashevskiy <aik@xxxxxxxxx> wrote:

> Hi Alex,
>
> I got a dell x86 machine with Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz and I
> am passing one of two EHCI hosts - 00:1a.0 - to a guest, and at first it
> worked but now it would not start the guest at all and all I see in dmesg
> related to the thing is:
>
> [ 338.331640] vfio-pci 0000:00:1a.0: enabling device (0000 -> 0002)
> [ 338.433717] vfio_cap_init: 0000:00:1a.0 hiding cap 0xa
> [ 352.296443] perf: interrupt took too long (8678 > 8277), lowering
> kernel.perf_event_max_sample_rate to 23000
> [ 381.438215] perf: interrupt took too long (11304 > 10847), lowering
> kernel.perf_event_max_sample_rate to 17000
> [ 417.441806] DMAR: DRHD: handling fault status reg 3
> [ 417.441813] DMAR: [DMA Read] Request device [00:1a.0] fault addr eb000
> [fault reason 06] PTE Read access is not set
>
>
> Does this look any familiar? Thanks.

Chances are the fault address falls within an RMRR, which is a VT-d
specific abomination onto IOMMUs. The short version, aiui, is that the
onboard USB controller provides PS/2 mouse and keyboard emulation for
legacy OSes via memory ranges configured by the BIOS and the RMRR is
used to request that the range be identity mapped for the device when
the IOMMU is enabled. As Linux is not a legacy OS, we ignore the RMRR
when the device is assigned, but it's still not uncommon to get these
stray reads that are usually benign. As such, it's probably not causing
the issue.

More likely, given the works-once behavior, is that we probably have no
means to reset the device between uses and someone left it in a state
that we're not recovering from. For root complex integrated endpoints,
if there's no FLR or even PM reset available, you're out of luck unless
you can come up with a device specific reset. With a plugin card, we'd
probably at least be able to perform a secondary bus reset.

Thanks,
Alex
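
For anyone wanting to check both guesses above on their own machine, here is
a rough sketch rather than a finished tool. It assumes the boot log carries
lines of the form "DMAR: RMRR base: 0x... end: 0x..." (the exact wording
differs between kernel versions) and that a device the kernel knows how to
reset exposes a "reset" attribute in its sysfs directory. The function names
and the sample RMRR range in the usage section are made up for illustration;
the range is not taken from the reporter's machine.

```python
import os
import re

# Assumed dmesg formats; they vary between kernel versions:
#   DMAR: RMRR base: 0x000000000e8000 end: 0x000000000effff
#   DMAR: [DMA Read] Request device [00:1a.0] fault addr eb000
RMRR_RE = re.compile(r"RMRR base: (0x[0-9a-fA-F]+) end: (0x[0-9a-fA-F]+)")
FAULT_RE = re.compile(
    r"Request device \[([0-9a-fA-F:.]+)\] fault addr (?:0x)?([0-9a-fA-F]+)"
)


def parse_rmrr_ranges(log):
    """Return the (base, end) pairs of every RMRR line found in the log."""
    return [(int(base, 16), int(end, 16)) for base, end in RMRR_RE.findall(log)]


def parse_faults(log):
    """Return (device, fault_address) for every DMAR fault line in the log."""
    return [(dev, int(addr, 16)) for dev, addr in FAULT_RE.findall(log)]


def faults_in_rmrr(log):
    """For each fault, report the RMRR range containing it, or None."""
    ranges = parse_rmrr_ranges(log)
    results = []
    for dev, addr in parse_faults(log):
        hit = next(((lo, hi) for lo, hi in ranges if lo <= addr <= hi), None)
        results.append((dev, addr, hit))
    return results


def has_reset_method(bdf, sysfs_root="/sys/bus/pci/devices"):
    """True if sysfs exposes a 'reset' attribute for the device.

    Assumption: the kernel only creates this file when it found some way
    to reset the function (FLR, PM reset, bus reset, device specific).
    """
    return os.path.exists(os.path.join(sysfs_root, bdf, "reset"))


if __name__ == "__main__":
    # Hypothetical log: the RMRR line is invented, the fault line is the
    # one quoted in this thread.
    sample = (
        "DMAR: RMRR base: 0x000000000e8000 end: 0x000000000effff\n"
        "DMAR: [DMA Read] Request device [00:1a.0] fault addr eb000\n"
    )
    for dev, addr, hit in faults_in_rmrr(sample):
        where = "inside an RMRR" if hit else "outside any RMRR seen in the log"
        print(f"{dev}: fault at {addr:#x} is {where}")
    print("reset available:", has_reset_method("0000:00:1a.0"))
```

Feeding it the full output of dmesg from boot onward is what matters, since
the RMRR lines are only printed early on. A missing "reset" file would fit
the works-once behavior described above, but it does not prove it.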