> From: Tian, Kevin <kevin.tian@xxxxxxxxx> > Sent: Friday, April 14, 2023 5:12 PM > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > > Sent: Friday, April 14, 2023 2:07 AM > > > > We had already iterated a proposal where the group-id is replaced with > > a dev-id in the existing ioctl and a flag indicates when the return > > value is a dev-id vs group-id. This had a gap that userspace cannot > > determine if a reset is available given this information since un-owned > > devices report an invalid dev-id and userspace can't know if it has > > implicit ownership. > > > > > It seems cleaner to me though that we would could still re-use INFO in > > a similar way, simply defining a new flag bit which is valid only in > > the case of returning dev-ids and indicates if the reset is available. > > Therefore in one ioctl, userspace knows if hot-reset is available > > (based on a kernel determination) and can pull valid dev-ids from the Need to confirm the meaning of hot-reset available flag. I think it should at least meet below two conditions to set this flag. Although it may not mean hot-reset is for sure to succeed. (but should be a high chance). 1) dev_set is resettable (all affected device are in dev_set) 2) affected device are owned by the current user Also, we need to has assumption that below two cases are rare if user encounters it, it just bad luck for them. I think the existing _INFO and hot-reset already has such assumption. So cdev mode can adopt it as well. a) physical topology change (e.g. new devices plugged to affected slot) b) an affected device is unbound from vfio > So the kernel needs to compare the group id between devices with > valid dev-ids and devices with invalid dev-ids to decide the implicit > ownership. For noiommu device which has no group_id when > VFIO_GROUP is off then it's resettable only if having a valid dev_id. In cdev mode, noiommu device doesn't have dev_id as it is not bound to valid iommufd. So if VFIO_GROUP is off, we may never allow hot-reset for noiommu devices. But we don't want to have regression with noiommu devices. Perhaps we may define the usage of the resettable flag like this: 1) if it is set, user does not need to own all the affected devices as some of them may have been owned implicitly. Kernel should have checked it. 2) if the flag is not set, that means user needs to check ownership by itself. It needs to own all the affected devices. If not, don't do hot-reset. This way we can still make noiommu devices support hot-reset just like VFIO_GROUP is on. Because noiommu devices have fake groups, such groups are all singleton. So checking all affected devices are opened by user is just same as check all affected groups. > The only corner case with this option is when a user mixes group > and cdev usages. iirc you mentioned it's a valid usage to be supported. > In that case the kernel doesn't have sufficient knowledge to judge > 'resettable' as it doesn't know which groups are opened by this user. > > Not sure whether we can leave it in a ugly way so INFO may not tell > 'resettable' accurately in that weird scenario. This seems not easy to support. If above scenario is allowed there can be three cases that returns invalid dev_id. 1) devices not opened by user but owned implicitly 2) devices not owned by user 3) devices opened via group but owned by user User would require more info to tell the above cases from each other. > > array to associate affected, owned devices, and still has the > > equivalent information to know that one or more of the devices listed > > with an invalid dev-id are preventing the hot-reset from being > > available. > > > > Is that an option? Thanks, > > > > This works for me if above corner case can be waived. One side check, perhaps already confirmed in prior email. @Alex, So the reason for the prediction of hot-reset is to avoid the possible vfio_pci_pre_reset() which does heavy operations like stop DMA and copy config space. Is it? Any other special reason? Anyhow, this reason is enough for this prediction per my understanding. Regards, Yi Liu