Hi Alex, > From: Liu, Yi L <yi.l.liu@xxxxxxxxx> > Sent: Monday, January 30, 2023 8:15 PM > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > > Sent: Friday, January 20, 2023 7:52 AM > > > > On Tue, 17 Jan 2023 05:49:39 -0800 > > Yi Liu <yi.l.liu@xxxxxxxxx> wrote: > > > > > VFIO group has historically allowed multi-open of the device FD. This > > > was made secure because the "open" was executed via an ioctl to the > > > group FD which is itself only single open. > > > > > > No know use of multiple device FDs is known. It is kind of a strange > > ^^ ^^^^ ^^^^^ > > How about "No known use of multiple device FDs today" > > > > thing to do because new device FDs can naturally be created via dup(). > > > > > > When we implement the new device uAPI there is no natural way to > allow > > > the device itself from being multi-opened in a secure manner. Without > > > the group FD we cannot prove the security context of the opener. > > > > > > Thus, when moving to the new uAPI we block the ability to multi-open > > > the device. This also makes the cdev path exclusive with group path. > > > > > > The main logic is in the vfio_device_open(). It needs to sustain both > > > the legacy behavior i.e. multi-open in the group path and the new > > > behavior i.e. single-open in the cdev path. This mixture leads to the > > > introduction of a new single_open flag stored both in struct vfio_device > > > and vfio_device_file. vfio_device_file::single_open is set per the > > > vfio_device_file allocation. Its value is propagated to struct vfio_device > > > after device is opened successfully. > > > > > > Signed-off-by: Yi Liu <yi.l.liu@xxxxxxxxx> > > > --- > > > drivers/vfio/group.c | 2 +- > > > drivers/vfio/vfio.h | 6 +++++- > > > drivers/vfio/vfio_main.c | 25 ++++++++++++++++++++++--- > > > include/linux/vfio.h | 1 + > > > 4 files changed, 29 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > > > index 9484bb1c54a9..57ebe5e1a7e6 100644 > > > --- a/drivers/vfio/group.c > > > +++ b/drivers/vfio/group.c > > > @@ -216,7 +216,7 @@ static struct file *vfio_device_open_file(struct > > vfio_device *device) > > > struct file *filep; > > > int ret; > > > > > > - df = vfio_allocate_device_file(device); > > > + df = vfio_allocate_device_file(device, false); > > > if (IS_ERR(df)) { > > > ret = PTR_ERR(df); > > > goto err_out; > > > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > > > index fe0fcfa78710..bdcf9762521d 100644 > > > --- a/drivers/vfio/vfio.h > > > +++ b/drivers/vfio/vfio.h > > > @@ -17,7 +17,11 @@ struct vfio_device; > > > struct vfio_container; > > > > > > struct vfio_device_file { > > > + /* static fields, init per allocation */ > > > struct vfio_device *device; > > > + bool single_open; > > > + > > > + /* fields set after allocation */ > > > struct kvm *kvm; > > > struct iommufd_ctx *iommufd; > > > bool access_granted; > > > @@ -30,7 +34,7 @@ int vfio_device_open(struct vfio_device_file *df, > > > void vfio_device_close(struct vfio_device_file *device); > > > > > > struct vfio_device_file * > > > -vfio_allocate_device_file(struct vfio_device *device); > > > +vfio_allocate_device_file(struct vfio_device *device, bool > single_open); > > > > > > extern const struct file_operations vfio_device_fops; > > > > > > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > > > index 90174a9015c4..78725c28b933 100644 > > > --- a/drivers/vfio/vfio_main.c > > > +++ b/drivers/vfio/vfio_main.c > > > @@ -345,7 +345,7 @@ static bool vfio_assert_device_open(struct > > vfio_device *device) > > > } > > > > > > struct vfio_device_file * > > > -vfio_allocate_device_file(struct vfio_device *device) > > > +vfio_allocate_device_file(struct vfio_device *device, bool single_open) > > > { > > > struct vfio_device_file *df; > > > > > > @@ -354,6 +354,7 @@ vfio_allocate_device_file(struct vfio_device > > *device) > > > return ERR_PTR(-ENOMEM); > > > > > > df->device = device; > > > + df->single_open = single_open; > > > > It doesn't make sense to me to convolute the definition of this > > function with an unmemorable bool arg when the one caller that sets the > > value true could simply open code it. > > Yeah, how about renaming it just like Kevin's suggestion? > > https://lore.kernel.org/kvm/BN9PR11MB52769CBCA68CD25DAC96B33B8CC > 49@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ > > > > > > > > > return df; > > > } > > > @@ -421,6 +422,16 @@ int vfio_device_open(struct vfio_device_file > *df, > > > > > > lockdep_assert_held(&device->dev_set->lock); > > > > > > + /* > > > + * Device cdev path cannot support multiple device open since > > > + * it doesn't have a secure way for it. So a second device > > > + * open attempt should be failed if the caller is from a cdev > > > + * path or the device has already been opened by a cdev path. > > > + */ > > > + if (device->open_count != 0 && > > > + (df->single_open || device->single_open)) > > > + return -EINVAL; > > > > IIUC, the reason this exists is that we let the user open the device > > cdev arbitrarily, but only one instance can call > > ioctl(VFIO_DEVICE_BIND_IOMMUFD). Why do we bother to let the user > > create those other file instances? What expectations are we setting > > for the user by allowing them to open the device but not use it? > > It won't be able to access device as such device fd is not bound to > an iommufd. > > > Clearly we're thinking about a case here where the device has been > > opened via the group path and the user is now attempting to bind the > > same device via the cdev path. > > This shall fail as the group path would inc the device->open_count. Then > the cdev path will be failed as the path would have df->single_open==true. > > > That seems wrong to even allow and I'm > > surprised it gets this far. In fact, where do we block a user from > > opening one device in a group via cdev and another via the group? > > such scenario would be failed by the DMA owner. > > The two paths would be excluded when claiming DMA ownership in > such scenario. The group path uses the vfio_group pointer as DMA > owner marker. While the cdev path uses the iommufd_ctx pointer. > But one group only allows one DMA owner. However, there is one possibility that the group path and cdev path have the same DMA marker. If the group path is the vfio iommufd compat mode, iommufd is used as container fd, and its DMA marker is also iommufd_ctx pointer, so it is possible that two devices in the the same group may be opened by different paths (the vfio compat mode group path and the cdev path). This seems to be ok. The group path will attach the group to an auto-allocated iommu_domain, while the cdev path actually waits for userspace to attach it to an IOAS. Userspace should take care of it. It should ensure the devices in the same group should be attached to the same domain. This seems to be no much difference with userspace opening two devices (from the same group) via cdev, userspace needs to attach them to the same domain as well. Also, there is a work from Nicolin to support domain replacement. So even the group path has attached the group to an iommu_domain, userspace can later replace it with the domain it desired (maybe still an auto-allocated domain during the IOAS attachment or a userspace allocated domain this is in my nesting series https://github.com/yiliu1765/iommufd/commits/wip/iommufd-v6.2-rc5-nesting). So for a VFIO device, we use the single_open flag to exclude opening from the group path and cdev path. While for different devices in the same group, normally userspace may not open devices in different paths. Even it does, seems to be fine as above statement. Regards, Yi Liu