> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Wednesday, April 12, 2023 5:58 AM
>
> On Tue, 11 Apr 2023 15:40:07 -0300
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Tue, Apr 11, 2023 at 11:11:17AM -0600, Alex Williamson wrote:
> > > [Appears the list got dropped, replying to my previous message to re-add]
> >
> > Wow, this got messed up a lot, mutt drops the cc when replying for some
> > reason. I think it is fixed up now
> >
> > > > Our cdev model says that opening a cdev locks out other cdevs from
> > > > independent use, eg because of the group sharing. Extending this to
> > > > include the reset group as well seems consistent.
> > >
> > > The DMA ownership model based on the IOMMU group is consistent with
> > > legacy vfio, but now you're proposing a new ownership model that
> > > optionally allows a user to extend their ownership, opportunistically
> > > lock out other users, and wreaking havoc for management utilities that
> > > also have no insight into dev_sets or userspace driver behavior.
> >
> > I suggested below that the ownership require enough open devices - so
> > it doesn't "extend ownership opportunistically", and there is no
> > havoc.
> >
> > Management tools already need to understand dev_set if they want to
> > offer reliable reset support to the VMs. Same as today.
>
> I don't think that's true. Our primary hot-reset use case is GPUs and
> subordinate functions, where the isolation and reset scope are often
> sufficiently similar to make hot-reset possible, regardless whether
> all the functions are assigned to a VM. I don't think you'll find any
> management tools that takes reset scope into account otherwise.

If we only care about the primary case where the iommu group and the
reset scope match, then why would the new claim model in Jason's
proposal require management tools to understand the reset scope now?

btw, in your earlier replies you pointed out the issue of unpredictable
ordering on a multi-function device, e.g. whichever of dpdk or qemu runs
first will block the other. But I wonder what the actual use is of
allowing both to run when neither can do a reset, due to the affected
reset scope in the current model. If a vfio user cannot do a reset,
doesn't that imply it hasn't acquired full permission on the device?
In that case Jason's proposal of explicitly failing it is actually a
cleaner model.

Thanks
Kevin