On Thu, 14 Oct 2021 17:57:46 +0800
Zhenguo Yao <yaozhenguo1@xxxxxxxxx> wrote:

> In some scenarios, vfio device can't do any reset in initialization
> process. For example: Nvswitch and GPU A100 working in Shared NVSwitch
> Virtualization Model. In such mode, there are two type VMs: service
> VM and Guest VM. The GPU devices are initialized in the following steps:
>
> 1. Service VM boot up. GPUs and Nvswitchs are passthrough to service VM.
> Nvidia driver and manager software will do some settings in service VM.
>
> 2. The selected GPUs are unpluged from service VM.
>
> 3. Guest VM boots up with the selected GPUs passthrough.
>
> The selected GPUs can't do any reset in step3, or they will be initialized
> failed in Guest VM.
>
> This patchset add a PCI sysfs interface:ignore_reset which drivers can
> use it to control whether to do PCI reset or not. For example: In Shared
> NVSwitch Virtualization Model. Hypervisor can disable PCI reset by setting
> ignore_reset to 1 before Gust VM booting up.
>
> Zhenguo Yao (2):
>   PCI: Add ignore_reset sysfs interface to control whether do device
>     reset in PCI drivers
>   vfio-pci: Don't do device reset when ignore_reset is setting
>
>  drivers/pci/pci-sysfs.c          | 25 +++++++++++++++++
>  drivers/vfio/pci/vfio_pci_core.c | 48 ++++++++++++++++++++------------
>  include/linux/pci.h              |  1 +
>  3 files changed, 56 insertions(+), 18 deletions(-)
>

This all seems like code to mask that these NVSwitch configurations are
probably insecure because we can't factor and manage NVSwitch isolation
into IOMMU grouping. I'm guessing this "service VM" pokes proprietary
registers to manage that isolation and perhaps later resetting devices
negates that programming. A more proper solution is probably to do our
best to guess the span of an NVSwitch configuration and make the IOMMU
group include all the devices, until NVIDIA provides proper code for the
kernel to understand this interconnect and how it affects DMA isolation.
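
As a point of reference for the grouping discussion, the groups the
kernel has already formed can be inspected from userspace. The following
is only a minimal sketch, assuming the usual /sys/kernel/iommu_groups
layout (one directory per group, each with a "devices" subdirectory).
The helper name list_iommu_groups is illustrative, not an existing tool,
and it knows nothing about NVSwitch topology; it only reports what the
kernel currently exposes.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group name to the sorted device names it contains.

    Assumes the sysfs layout <root>/<group>/devices/<device>. Returns an
    empty dict when the root does not exist (for example, no IOMMU enabled).
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in os.listdir(root):
        devices_dir = os.path.join(root, group, "devices")
        if os.path.isdir(devices_dir):
            groups[group] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    # Print one line per group so devices sharing a group are easy to spot.
    for group, devices in sorted(list_iommu_groups().items()):
        print(f"group {group}: {' '.join(devices)}")
```

If the GPUs and NVSwitches of one configuration show up in separate
groups here, the kernel is treating them as independently isolatable,
which is the assumption being questioned above.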

Nak on disabling resets for the purpose of preventing a user from undoing
proprietary device programming. Thanks,

Alex