On Thu, 16 Mar 2023 23:29:21 +0000 "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote: > > From: Alex Williamson > > Sent: Friday, March 17, 2023 2:46 AM > > > > On Wed, 15 Mar 2023 23:31:23 +0000 > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote: > > > > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > > > > Sent: Thursday, March 16, 2023 6:53 AM > > > > I'm afraid this proposal reduces or eliminates the handshake we have > > > > with userspace between VFIO_DEVICE_GET_PCI_HOT_RESET_INFO and > > > > VFIO_DEVICE_PCI_HOT_RESET, which could promote userspace to ignore > > the > > > > _INFO ioctl altogether, resulting in drivers that don't understand the > > > > scope of the reset. Is it worth it? What do we really gain? > > > > > > Jason raised the concern whether GET_PCI_HOT_RESET_INFO is actually > > > useful today. > > > > > > It's an interface on opened device. So the tiny difference is whether the > > > user knows the device is resettable when calling GET_INFO or later when > > > actually calling PCI_HOT_RESET. > > > > No, GET_PCI_HOT_RESET_INFO conveys not only whether a PCI_HOT_RESET > > can > > be performed, but equally important the scope of the reset, ie. which > > devices are affected by the reset. If we de-emphasize the INFO > > portion, then this easily gets confused as just a variant of > > VFIO_DEVICE_RESET, which is explicitly a device-level cscope reset. In > > fact, I'd say the interface is not only trying to validate that the > > user has sufficient privileges for the reset, but that they explicitly > > acknowledge the scope of the reset. > > > > IMHO the usefulness of scope is if it's discoverable by the management > stack which then can try to assign devices with affected reset to a same > user. Disagree, the user needs to know the scope of reset. Take for instance two function of a device configured onto separate buses within a VM. The VMM needs to know that a hot-reset of one will reset the other. That's not obvious to the VMM without some understanding of PCI/e topology and analysis of the host system. The info ioctl simplifies that discovery for the VMM and the handshake of passing the affected groups makes sure that the info ioctl remains relevant. OTOH, I really haven't seen any evidence that the null-array concept provides significant simplification for userspace, especially without compromising the user's understanding of the scope of the provided reset. Thanks, Alex