> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, January 20, 2023 7:52 AM
>
> On Tue, 17 Jan 2023 05:49:39 -0800
> Yi Liu <yi.l.liu@xxxxxxxxx> wrote:
>
> > VFIO group has historically allowed multi-open of the device FD. This
> > was made secure because the "open" was executed via an ioctl to the
> > group FD which is itself only single open.
> >
> > No know use of multiple device FDs is known. It is kind of a strange
>   ^^ ^^^^                               ^^^^^

How about "No known use of multiple device FDs today"

> > thing to do because new device FDs can naturally be created via dup().
> >
> > When we implement the new device uAPI there is no natural way to allow
> > the device itself from being multi-opened in a secure manner. Without
> > the group FD we cannot prove the security context of the opener.
> >
> > Thus, when moving to the new uAPI we block the ability to multi-open
> > the device. This also makes the cdev path exclusive with group path.
> >
> > The main logic is in the vfio_device_open(). It needs to sustain both
> > the legacy behavior i.e. multi-open in the group path and the new
> > behavior i.e. single-open in the cdev path. This mixture leads to the
> > introduction of a new single_open flag stored both in struct vfio_device
> > and vfio_device_file. vfio_device_file::single_open is set per the
> > vfio_device_file allocation. Its value is propagated to struct vfio_device
> > after device is opened successfully.
> >
> > Signed-off-by: Yi Liu <yi.l.liu@xxxxxxxxx>
> > ---
> >  drivers/vfio/group.c     |  2 +-
> >  drivers/vfio/vfio.h      |  6 +++++-
> >  drivers/vfio/vfio_main.c | 25 ++++++++++++++++++++++---
> >  include/linux/vfio.h     |  1 +
> >  4 files changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
> > index 9484bb1c54a9..57ebe5e1a7e6 100644
> > --- a/drivers/vfio/group.c
> > +++ b/drivers/vfio/group.c
> > @@ -216,7 +216,7 @@ static struct file *vfio_device_open_file(struct
> vfio_device *device)
> >  	struct file *filep;
> >  	int ret;
> >
> > -	df = vfio_allocate_device_file(device);
> > +	df = vfio_allocate_device_file(device, false);
> >  	if (IS_ERR(df)) {
> >  		ret = PTR_ERR(df);
> >  		goto err_out;
> > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> > index fe0fcfa78710..bdcf9762521d 100644
> > --- a/drivers/vfio/vfio.h
> > +++ b/drivers/vfio/vfio.h
> > @@ -17,7 +17,11 @@ struct vfio_device;
> >  struct vfio_container;
> >
> >  struct vfio_device_file {
> > +	/* static fields, init per allocation */
> >  	struct vfio_device *device;
> > +	bool single_open;
> > +
> > +	/* fields set after allocation */
> >  	struct kvm *kvm;
> >  	struct iommufd_ctx *iommufd;
> >  	bool access_granted;
> > @@ -30,7 +34,7 @@ int vfio_device_open(struct vfio_device_file *df,
> >  void vfio_device_close(struct vfio_device_file *device);
> >
> >  struct vfio_device_file *
> > -vfio_allocate_device_file(struct vfio_device *device);
> > +vfio_allocate_device_file(struct vfio_device *device, bool single_open);
> >
> >  extern const struct file_operations vfio_device_fops;
> >
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index 90174a9015c4..78725c28b933 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -345,7 +345,7 @@ static bool vfio_assert_device_open(struct
> vfio_device *device)
> >  }
> >
> >  struct vfio_device_file *
> > -vfio_allocate_device_file(struct vfio_device *device)
> > +vfio_allocate_device_file(struct vfio_device *device, bool single_open)
> >  {
> >  	struct vfio_device_file *df;
> >
> > @@ -354,6 +354,7 @@ vfio_allocate_device_file(struct vfio_device
> *device)
> >  		return ERR_PTR(-ENOMEM);
> >
> >  	df->device = device;
> > +	df->single_open = single_open;
>
> It doesn't make sense to me to convolute the definition of this
> function with an unmemorable bool arg when the one caller that sets the
> value true could simply open code it.

Yeah, how about renaming it just like Kevin's suggestion?

https://lore.kernel.org/kvm/BN9PR11MB52769CBCA68CD25DAC96B33B8CC49@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

> >
> >  	return df;
> >  }
> > @@ -421,6 +422,16 @@ int vfio_device_open(struct vfio_device_file *df,
> >
> >  	lockdep_assert_held(&device->dev_set->lock);
> >
> > +	/*
> > +	 * Device cdev path cannot support multiple device open since
> > +	 * it doesn't have a secure way for it. So a second device
> > +	 * open attempt should be failed if the caller is from a cdev
> > +	 * path or the device has already been opened by a cdev path.
> > +	 */
> > +	if (device->open_count != 0 &&
> > +	    (df->single_open || device->single_open))
> > +		return -EINVAL;
>
> IIUC, the reason this exists is that we let the user open the device
> cdev arbitrarily, but only one instance can call
> ioctl(VFIO_DEVICE_BIND_IOMMUFD). Why do we bother to let the user
> create those other file instances? What expectations are we setting
> for the user by allowing them to open the device but not use it?

It won't be able to access the device, since such a device fd is not
bound to an iommufd.

> Clearly we're thinking about a case here where the device has been
> opened via the group path and the user is now attempting to bind the
> same device via the cdev path.

This shall fail: the group path would have incremented
device->open_count, so the cdev path fails because it has
df->single_open == true.

> That seems wrong to even allow and I'm
> surprised it gets this far. In fact, where do we block a user from
> opening one device in a group via cdev and another via the group?

Such a scenario would fail at the DMA ownership check. The two paths
exclude each other when claiming DMA ownership: the group path uses the
vfio_group pointer as the DMA owner marker, while the cdev path uses the
iommufd_ctx pointer, and a group allows only one DMA owner.

> > +
> >  	device->open_count++;
> >  	if (device->open_count == 1) {
> >  		int ret;
> > @@ -430,6 +441,7 @@ int vfio_device_open(struct vfio_device_file *df,
> >  			device->open_count--;
> >  			return ret;
> >  		}
> > +		device->single_open = df->single_open;
> >  	}
> >
> >  	/*
> > @@ -446,8 +458,10 @@ void vfio_device_close(struct vfio_device_file *df)
> >
> >  	mutex_lock(&device->dev_set->lock);
> >  	vfio_assert_device_open(device);
> > -	if (device->open_count == 1)
> > +	if (device->open_count == 1) {
> >  		vfio_device_last_close(df);
> > +		device->single_open = false;
> > +	}
> >  	device->open_count--;
> >  	mutex_unlock(&device->dev_set->lock);
> >  }
> > @@ -493,7 +507,12 @@ static int vfio_device_fops_release(struct inode
> *inode, struct file *filep)
> >  	struct vfio_device_file *df = filep->private_data;
> >  	struct vfio_device *device = df->device;
> >
> > -	vfio_device_group_close(df);
> > +	/*
> > +	 * group path supports multiple device open, while cdev doesn't.
> > +	 * So use vfio_device_group_close() for !singel_open case.
> > +	 */
> > +	if (!df->single_open)
> > +		vfio_device_group_close(df);
>
> If we're going to use this to differentiate group vs cdev use cases,
> then let's name it something to reflect that rather than pretending it
> only limits the number of opens, ex. is_cdev_device. Thanks,

Yes, I'll follow it. Kevin has a similar comment on it.

Regards,
Yi Liu
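
For readers following the thread, the open/close rule discussed above can be modeled in user space. This is only a sketch under stated assumptions, not the kernel code: the model_* names are hypothetical, there is no locking or driver callback, and the first-open step is assumed never to fail. It mirrors just the open_count and single_open bookkeeping from the quoted hunks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct vfio_device. */
struct model_device {
    unsigned int open_count;
    bool single_open;   /* true while a cdev-style opener holds the device */
};

/* Hypothetical user-space stand-in for struct vfio_device_file. */
struct model_device_file {
    struct model_device *device;
    bool single_open;   /* true for the cdev path, false for the group path */
};

/*
 * Mirrors the check added to vfio_device_open(): a second open is
 * refused if either the new opener or the current holder is single-open.
 */
static int model_device_open(struct model_device_file *df)
{
    struct model_device *device = df->device;

    if (device->open_count != 0 &&
        (df->single_open || device->single_open))
        return -EINVAL;

    device->open_count++;
    if (device->open_count == 1)
        device->single_open = df->single_open;
    return 0;
}

/* Mirrors vfio_device_close(): the last close clears the device flag. */
static void model_device_close(struct model_device_file *df)
{
    struct model_device *device = df->device;

    assert(device->open_count > 0);
    if (device->open_count == 1)
        device->single_open = false;
    device->open_count--;
}
```

In this model, two group-style opens succeed and a later cdev-style open returns -EINVAL; once all group opens are closed, a cdev-style open succeeds and any further open of either kind is refused until it is closed.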