On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote:
> "A PCI completion timeout occurred for an outstanding PCI-E transaction"
> it is.
>
> This is how I bind the device to vfio:
>
> echo vfio-pci > '/sys/bus/pci/devices/0000:01:00.0/driver_override'
> echo vfio-pci > '/sys/bus/pci/devices/0000:01:00.1/driver_override'
> echo '0000:01:00.0' > '/sys/bus/pci/devices/0000:01:00.0/driver/unbind'
> echo '0000:01:00.1' > '/sys/bus/pci/devices/0000:01:00.1/driver/unbind'
> echo '0000:01:00.0' > /sys/bus/pci/drivers/vfio-pci/bind
> echo '0000:01:00.1' > /sys/bus/pci/drivers/vfio-pci/bind
>
>
> and I noticed that EEH only happens with the last command. The order
> (.0,.1 or .1,.0) does not matter, it seems that putting one function to
> D3 is fine but putting another one when the first one is already in D3 -
> produces EEH. And I do not recall ever seeing this on the firestone
> machine. Weird.

Putting all functions into D3 is what allows the device to actually go
into D3.

Does it work with other devices? We do have that bug on early P9
revisions where the attempt of bringing the link to L1 as part of the
D3 process fails in horrible ways, I thought P8 would be ok but maybe
not ...

Otherwise, it might be that our timeouts are too low (you may want to
talk to our PCIe guys internally).

Cheers,
Ben.