Re: [PATCH v3 3/3] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory

Peter Xu <peterx@xxxxxxxxxx> · Sat, 23 May 2020 20:02:01 -0400



(CC Andrea too)

On Sat, May 23, 2020 at 07:52:57PM -0400, Peter Xu wrote:
> On Sat, May 23, 2020 at 05:06:02PM -0600, Alex Williamson wrote:
> > On Sat, 23 May 2020 15:34:17 -0400
> > Peter Xu <peterx@xxxxxxxxxx> wrote:
> > 
> > > Hi, Alex,
> > > 
> > > On Fri, May 22, 2020 at 01:17:43PM -0600, Alex Williamson wrote:
> > > > @@ -1346,15 +1526,32 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> > > >  {
> > > >  	struct vm_area_struct *vma = vmf->vma;
> > > >  	struct vfio_pci_device *vdev = vma->vm_private_data;
> > > > +	vm_fault_t ret = VM_FAULT_NOPAGE;
> > > > +
> > > > +	mutex_lock(&vdev->vma_lock);
> > > > +	down_read(&vdev->memory_lock);  
> > > 
> > > I remembered to have seen the fault() handling FAULT_FLAG_RETRY_NOWAIT at least
> > > in the very first version, but it's not here any more...  Could I ask what's
> > > the reason behind?  I probably have missed something along with the versions,
> > > I'm just not sure whether e.g. this would potentially block a GUP caller even
> > > if it's with FOLL_NOWAIT.
> > 
> > This is largely what v2 was about, from the cover letter:
> > 
> >     Locking in 3/ is substantially changed to avoid the retry scenario
> >     within the fault handler, therefore a caller who does not allow
> >     retry will no longer receive a SIGBUS on contention.
> > 
> > The discussion thread starts here:
> > 
> > https://lore.kernel.org/kvm/20200501234849.GQ26002@xxxxxxxx/
> 
> [1]
> 
> > 
> > Feel free to interject if there's something that doesn't make sense,
> > the idea is that since we've fixed the lock ordering we never need to
> > release one lock to wait for another, therefore we can wait for the
> > lock.  I'm under the impression that we can wait for the lock
> > regardless of the flags under these conditions.
> 
> I see; thanks for the link.  Sorry I should probably follow up the discussion
> and ask the question earlier, anyway...
> 
> For what I understand now, IMHO we should still need all those handlings of
> FAULT_FLAG_RETRY_NOWAIT like in the initial version.  E.g., IIUC KVM gup will
> try with FOLL_NOWAIT when async is allowed, before the complete slow path.  I'm
> not sure what would be the side effect of that if fault() blocked it.  E.g.,
> the caller could be in an atomic context.
> 
> But now I also agree that VM_FAULT_SIGBUS is probably not correct there in the
> initial version [1] - I thought it was OK initially (after all after the
> multiple fault retry series we should always be with FAULT_FLAG_ALLOW_RETRY..).
> However after some thinking... it should be the common slow path where retry is
> simply not allowed.  So IMHO instead of SIGBUS there, we should also use all
> the slow path of the locks.  That'll be safe then because it's never going to
> be with FAULT_FLAG_RETRY_NOWAIT (FAULT_FLAG_RETRY_NOWAIT depends on
> FAULT_FLAG_ALLOW_RETRY).
> 
> A reference code could be __lock_page_or_retry() where the lock_page could wait
> just like we taking the sems/mutexes, and the previous SIGBUS case would
> corresponds to this chunk of __lock_page_or_retry():
> 
> 	} else {
> 		if (flags & FAULT_FLAG_KILLABLE) {
> 			int ret;
> 
> 			ret = __lock_page_killable(page);
> 			if (ret) {
> 				up_read(&mm->mmap_sem);
> 				return 0;
> 			}
> 		} else
> 			__lock_page(page);
> 		return 1;
> 	}
> 
> Thanks,
> 
> -- 
> Peter Xu

-- 
Peter Xu