Re: [PATCH] vfio/pci: make the vfio_pci_mmap_fault reentrant

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 9 Mar 2021 19:45:03 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Mar 09, 2021 at 12:56:39PM -0700, Alex Williamson wrote:
> 
> > And I think this is what we end up with for the current code base:  
> 
> Yeah, that looks Ok
>  
> > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > index 65e7e6b44578..2f247ab18c66 100644
> > +++ b/drivers/vfio/pci/vfio_pci.c
> > @@ -1568,19 +1568,24 @@ void vfio_pci_memory_unlock_and_restore(struct vfio_pci_device *vdev, u16 cmd)
> >  }
> >  
> >  /* Caller holds vma_lock */
> > -static int __vfio_pci_add_vma(struct vfio_pci_device *vdev,
> > -			      struct vm_area_struct *vma)
> > +struct vfio_pci_mmap_vma *__vfio_pci_add_vma(struct vfio_pci_device *vdev,
> > +					     struct vm_area_struct *vma)
> >  {
> >  	struct vfio_pci_mmap_vma *mmap_vma;
> >  
> > +	list_for_each_entry(mmap_vma, &vdev->vma_list, vma_next) {
> > +		if (mmap_vma->vma == vma)
> > +			return ERR_PTR(-EEXIST);
> > +	}
> > +
> >  	mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL);
> >  	if (!mmap_vma)
> > -		return -ENOMEM;
> > +		return ERR_PTR(-ENOMEM);
> >  
> >  	mmap_vma->vma = vma;
> >  	list_add(&mmap_vma->vma_next, &vdev->vma_list);
> >  
> > -	return 0;
> > +	return mmap_vma;
> >  }
> >  
> >  /*
> > @@ -1612,30 +1617,39 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> >  {
> >  	struct vm_area_struct *vma = vmf->vma;
> >  	struct vfio_pci_device *vdev = vma->vm_private_data;
> > -	vm_fault_t ret = VM_FAULT_NOPAGE;
> > +	struct vfio_pci_mmap_vma *mmap_vma;
> > +	unsigned long vaddr, pfn;
> > +	vm_fault_t ret;
> >  
> >  	mutex_lock(&vdev->vma_lock);
> >  	down_read(&vdev->memory_lock);
> >  
> >  	if (!__vfio_pci_memory_enabled(vdev)) {
> >  		ret = VM_FAULT_SIGBUS;
> > -		mutex_unlock(&vdev->vma_lock);
> >  		goto up_out;
> >  	}
> >  
> > -	if (__vfio_pci_add_vma(vdev, vma)) {
> > -		ret = VM_FAULT_OOM;
> > -		mutex_unlock(&vdev->vma_lock);
> > +	mmap_vma = __vfio_pci_add_vma(vdev, vma);
> > +	if (IS_ERR(mmap_vma)) {
> > +		/* A concurrent fault might have already inserted the page */
> > +		ret = (PTR_ERR(mmap_vma) == -EEXIST) ? VM_FAULT_NOPAGE :
> > +						       VM_FAULT_OOM;  
> 
> I think -EEIXST should not be an error, lets just go down to the
> vmf_insert_pfn() and let the MM resolve the race naturally.
> 
> I suspect returning VM_FAULT_NOPAGE will be averse to the userspace if
> it hits this race??

Given the serialization on vma_lock, if the vma_list entry exists then
the full vma should already be populated, so I don't see the NOPAGE
issue you're worried about.  However, if we wanted to be more similar
to what we expect the new version to do, we could proceed through
re-inserting the pages on -EEXIST.  Zeng Tao's re-ordering to add the
vma_list entry only after successfully inserting all the pfns might work
better for that.
 
> Also the _prot does look needed at least due to the SME, but possibly
> also to ensure NC gets set..

If we need more than pgprot_decrypted(vma->vm_page_prot), please let us
know, but that's all we were getting from io_remap_pfn_range() afaict.
Thanks,

Alex




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux