On Wed, 2018-12-19 at 08:52:13 UTC, Alexey Kardashevskiy wrote:
> The skiboot firmware has a hot reset handler which fences the NVIDIA V100
> GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs:
> https://github.com/open-power/skiboot/commit/fca2b2b839a67
>
> Now we are going to pass V100 via VFIO which most certainly involves
> KVM guests which are often terminated without getting a chance to offline
> GPU RAM so we end up with a running machine with misconfigured memory.
> Accessing this memory produces hardware management interrupts (HMI)
> which bring the host down.
>
> To suppress HMIs, this wires up this hot reset hook to vfio_pci_disable()
> via pci_disable_device() which switches NPU2 to a safe mode and prevents
> HMIs.
>
> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
> Acked-by: Alistair Popple <alistair@xxxxxxxxxxxx>
> Reviewed-by: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ab7032e793f9ad799ca2692046fba5

cheers