Re: [PATCH v1 0/2] Add ability of VFIO driver to ignore reset when device doesn't need it

OK. Thank you. Let's wait for NVIDIA's solution.

On Thu, Oct 14, 2021 at 8:48 PM, Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
>
> On Thu, 14 Oct 2021 17:57:46 +0800
> Zhenguo Yao <yaozhenguo1@xxxxxxxxx> wrote:
>
> > In some scenarios, a vfio device must not be reset during the
> > initialization process. One example is NVSwitch and A100 GPUs working in
> > the Shared NVSwitch Virtualization Model. In this mode there are two types
> > of VM: the service VM and the guest VM. The GPU devices are initialized in
> > the following steps:
> >
> > 1. The service VM boots up. GPUs and NVSwitches are passed through to the
> > service VM. The NVIDIA driver and manager software perform some setup in
> > the service VM.
> >
> > 2. The selected GPUs are unplugged from the service VM.
> >
> > 3. The guest VM boots up with the selected GPUs passed through.
> >
> > The selected GPUs must not be reset in step 3, or they will fail to
> > initialize in the guest VM.
> >
> > This patchset adds a PCI sysfs interface, ignore_reset, which drivers can
> > use to control whether to do a PCI reset. For example, in the Shared
> > NVSwitch Virtualization Model, the hypervisor can disable PCI reset by
> > setting ignore_reset to 1 before the guest VM boots up.
> >
> > Zhenguo Yao (2):
> >   PCI: Add ignore_reset sysfs interface to control whether do device
> >     reset in PCI drivers
> >   vfio-pci: Don't do device reset when ignore_reset is setting
> >
> >  drivers/pci/pci-sysfs.c          | 25 +++++++++++++++++
> >  drivers/vfio/pci/vfio_pci_core.c | 48 ++++++++++++++++++++------------
> >  include/linux/pci.h              |  1 +
> >  3 files changed, 56 insertions(+), 18 deletions(-)
> >
>
> This all seems like code to mask that these NVSwitch configurations are
> probably insecure because we can't factor and manage NVSwitch isolation
> into IOMMU grouping.  I'm guessing this "service VM" pokes proprietary
> registers to manage that isolation and perhaps later resetting devices
> negates that programming.  A more proper solution is probably to do our
> best to guess the span of an NVSwitch configuration and make the IOMMU
> group include all the devices, until NVIDIA provides proper code for
> the kernel to understand this interconnect and how it affects DMA
> isolation.  Nak on disabling resets for the purpose of preventing a
> user from undoing proprietary device programming.  Thanks,
>
> Alex
>



