On Thu, 30 May 2024 00:09:49 +0000 "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote: > > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > > Sent: Friday, May 24, 2024 3:56 AM > > > > -/* Caller holds vma_lock */ > > -static int __vfio_pci_add_vma(struct vfio_pci_core_device *vdev, > > - struct vm_area_struct *vma) > > +static int vma_to_pfn(struct vm_area_struct *vma, unsigned long *pfn) > > { > > - struct vfio_pci_mmap_vma *mmap_vma; > > - > > - mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL_ACCOUNT); > > - if (!mmap_vma) > > - return -ENOMEM; > > - > > - mmap_vma->vma = vma; > > - list_add(&mmap_vma->vma_next, &vdev->vma_list); > > + struct vfio_pci_core_device *vdev = vma->vm_private_data; > > + int index = vma->vm_pgoff >> (VFIO_PCI_OFFSET_SHIFT - > > PAGE_SHIFT); > > + u64 pgoff; > > > > - return 0; > > -} > > + if (index >= VFIO_PCI_ROM_REGION_INDEX || > > + !vdev->bar_mmap_supported[index] || !vdev->barmap[index]) > > + return -EINVAL; > > Is a WARN_ON() required here? If those checks fail vfio_pci_core_mmap() > will return error w/o installing vm_ops. I think these tests largely come from previous iterations of the patch where this function had more versatility, because yes, they do exactly duplicate tests that we would have already passed before we established this function in the vm_ops.fault path. We could therefore wrap this in a WARN_ON, but actually with the current usage it's really just a sanity test that vma->vm_pgoff hasn't changed. We don't change barmap or bar_mmap_supported while the device is opened. Is it all too much paranoia and we should remove the test entirely and have this function return void? > > @@ -2506,17 +2373,11 @@ static int vfio_pci_dev_set_hot_reset(struct > > vfio_device_set *dev_set, > > struct vfio_pci_group_info *groups, > > struct iommufd_ctx *iommufd_ctx) > > the comment before this function should be updated too: > > /* > * We need to get memory_lock for each device, but devices can share mmap_lock, > * therefore we need to zap and hold the vma_lock for each device, and only then > * get each memory_lock. > */ Good catch. I think I'd just delete this comment altogether and expand the existing comment in the loop body as: /* * Take the memory write lock for each device and zap BAR * mappings to prevent the user accessing the device while in * reset. Locking multiple devices is prone to deadlock, * runaway and unwind if we hit contention. */ if (!down_write_trylock(&vdev->memory_lock)) { ret = -EBUSY; break; } Sound good? Thanks, Alex