Re: The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interface


 



> > > Hi,
> > >
> > > I start a virtual machine with commandline:
> > >     /usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device
> > > vfio-pci,host=0000:81:00.0
> > >
> > > Then I pause the qemu process before executing the main_loop
> > > function by gdb.
> > > At this moment, lspci shows the regions are disabled like below:
> > >     81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)
> > >         Subsystem: NVIDIA Corporation Device 118f
> > >         Physical Slot: 0-6
> > >         Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> > >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > >         Interrupt: pin A routed to IRQ 35
> > >         NUMA node: 1
> > >         Region 0: Memory at c8000000 (32-bit, non-prefetchable) [disabled] [size=16M]
> > >         Region 1: Memory at 27800000000 (64-bit, prefetchable) [disabled] [size=16G]
> > >         Region 3: Memory at 27c00000000 (64-bit, prefetchable) [disabled] [size=32M]
> > >
> > > But after the command:
> > > echo 1 > /sys/bus/pci/devices/0000:81:00.0/reset
> > > lspci shows the regions are *not* disabled:
> > >     81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)
> > >         Subsystem: Huawei Technologies Co., Ltd. Device 2061
> > >         Physical Slot: 0-6
> > >         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
> > >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > >         Latency: 0, Cache Line Size: 32 bytes
> > >         Interrupt: pin A routed to IRQ 7
> > >         NUMA node: 1
> > >         Region 0: Memory at c8000000 (32-bit, non-prefetchable) [size=16M]
> > >         Region 1: Memory at 27800000000 (64-bit, prefetchable) [size=16G]
> > >         Region 3: Memory at 27c00000000 (64-bit, prefetchable) [size=32M]
> > >
> > > AFAIK, qemu performs vfio_pci_reset like the below callstack:
> > >     Qemu:
> > >         vfio_pci_reset
> > >             ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)
> > > Kernel:
> > >     vfio_pci_ioctl
> > >         pci_try_reset_function
> > >             __pci_reset_function_locked
> > >                     pci_parent_bus_reset
> > >                         pci_reset_bridge_secondary_bus
> > >
> > > and writing 1 to the sysfs reset interface goes through this path:
> > > Kernel:
> > >     reset_store
> > >         pci_reset_function
> > >             __pci_reset_function_locked
> > >                     pci_parent_bus_reset
> > >                         pci_reset_bridge_secondary_bus
> > >
> > > So it seems that these two methods are actually the same, and I am
> > > confused why the results are inconsistent.
> >
> > Maybe there's a misunderstanding here, the kernel PCI reset functions
> > save and restore config space around the reset.  The intention of the
> > reset is to re-init the internal state of the device while preserving
> > (via save+restore) the config space.  The BARs being disabled is simply a
> > matter of the Memory bit in the Command register being unset (note Mem-).
> > Whether this is indicative of some issue depends on whether the state
> > before reset matches the state after reset, not that the states after
> > two different paths of triggering a reset are identical.
> >
> > vfio-pci will hand off the device to the user (QEMU) disabled, so the
> > states in the first example make sense to me.  In the second case,
> > it's not clear what the starting state is for the device.  Was this
> > reset performed from the starting point of the first case or is the
> > device in some arbitrary, unknown state prior to reset?  Thanks,
> >
> > Alex
> In the second case, the reset was performed from the starting point of the
> first case.
> IOW, I think the states before the reset are identical in both cases. The
> only difference I can think of is that the qemu process performs the reset
> twice: once when vfio opens the device's fd, and once as I mentioned above.
> 
> Thanks,
> Wu Zongyong

You're right. The initial states are not identical.
I found that the function vfio_pci_pre_reset in qemu does this before the reset:
    /*
     * Stop any ongoing DMA by disconnecting I/O, MMIO, and bus master.
     * Also put INTx Disable in known state.
     */
    cmd = vfio_pci_read_config(pdev, PCI_COMMAND, 2);
    cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
             PCI_COMMAND_INTX_DISABLE);
    vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);

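So the two resets do not behave the same way: qemu clears the I/O, Memory,
Bus Master and INTx Disable bits in the Command register before its reset,
while the sysfs path does no such masking.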
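
To illustrate the effect of that mask on a Command register value, here is a
minimal standalone sketch. The bit values are the ones from Linux's
<linux/pci_regs.h>; the helper names pre_reset_mask and print_command are
hypothetical and are not part of qemu. The sketch only models the bit
arithmetic and touches no real device.

```c
#include <stdint.h>
#include <stdio.h>

/* Command register bits, values as in Linux's <linux/pci_regs.h>. */
#define PCI_COMMAND_IO            0x0001  /* I/O space enable */
#define PCI_COMMAND_MEMORY        0x0002  /* Memory space enable */
#define PCI_COMMAND_MASTER        0x0004  /* Bus master enable */
#define PCI_COMMAND_PARITY        0x0040  /* Parity error response */
#define PCI_COMMAND_SERR          0x0100  /* SERR# enable */
#define PCI_COMMAND_INTX_DISABLE  0x0400  /* INTx disable */

/* Hypothetical helper: apply the same mask as the snippet above. */
static uint16_t pre_reset_mask(uint16_t cmd)
{
    uint16_t clear = PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
                     PCI_COMMAND_MASTER | PCI_COMMAND_INTX_DISABLE;
    return (uint16_t)(cmd & ~clear);
}

/* Hypothetical helper: print a subset of the flags in lspci's style. */
static void print_command(uint16_t cmd)
{
    printf("I/O%c Mem%c BusMaster%c ParErr%c SERR%c DisINTx%c\n",
           (cmd & PCI_COMMAND_IO) ? '+' : '-',
           (cmd & PCI_COMMAND_MEMORY) ? '+' : '-',
           (cmd & PCI_COMMAND_MASTER) ? '+' : '-',
           (cmd & PCI_COMMAND_PARITY) ? '+' : '-',
           (cmd & PCI_COMMAND_SERR) ? '+' : '-',
           (cmd & PCI_COMMAND_INTX_DISABLE) ? '+' : '-');
}
```

For example, the Control line of the second lspci output (I/O+ Mem+
BusMaster+ ParErr+ SERR+) corresponds to 0x0147, and pre_reset_mask(0x0147)
gives 0x0140, i.e. Mem- and therefore regions reported as [disabled].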

Is this masking really necessary here?
I want to write to the regions of the PCI device after the reset, so could I
set the Memory bit in the Command register again in vfio_pci_post_reset?
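
As a rough sketch of what setting the Memory bit again would involve, the
following does a read-modify-write of the 16-bit Command register at offset
0x04. It assumes a file descriptor that exposes the config space bytes from
offset 0 (for example the sysfs config file of the device); inside qemu the
equivalent would go through vfio_pci_read_config/vfio_pci_write_config
instead. The helper names are hypothetical, and whether enabling Memory at
this point is safe is exactly the open question above.

```c
#define _XOPEN_SOURCE 700   /* for pread/pwrite */
#include <stdint.h>
#include <unistd.h>

#define PCI_COMMAND         0x04    /* offset of the Command register */
#define PCI_COMMAND_MEMORY  0x0002  /* Memory space enable */

/* Hypothetical helper: return cmd with the Memory bit set. */
static uint16_t with_memory_enabled(uint16_t cmd)
{
    return (uint16_t)(cmd | PCI_COMMAND_MEMORY);
}

/*
 * Hypothetical helper: read-modify-write the Command register through
 * cfg_fd. Config space is little-endian, so the two bytes are assembled
 * by hand. Returns 0 on success, -1 if the read or write fails.
 */
static int enable_memory_space(int cfg_fd)
{
    uint8_t buf[2];
    uint16_t cmd;

    if (pread(cfg_fd, buf, sizeof(buf), PCI_COMMAND) != (ssize_t)sizeof(buf))
        return -1;

    cmd = (uint16_t)(buf[0] | (buf[1] << 8));
    cmd = with_memory_enabled(cmd);

    buf[0] = (uint8_t)(cmd & 0xff);
    buf[1] = (uint8_t)(cmd >> 8);

    if (pwrite(cfg_fd, buf, sizeof(buf), PCI_COMMAND) != (ssize_t)sizeof(buf))
        return -1;

    return 0;
}
```

Only the Memory bit is changed; I/O, Bus Master and the other Command bits
keep whatever value was read back.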

Thanks,
Wu Zongyong







