(CC Andrea too) On Sat, May 23, 2020 at 07:52:57PM -0400, Peter Xu wrote: > On Sat, May 23, 2020 at 05:06:02PM -0600, Alex Williamson wrote: > > On Sat, 23 May 2020 15:34:17 -0400 > > Peter Xu <peterx@xxxxxxxxxx> wrote: > > > > > Hi, Alex, > > > > > > On Fri, May 22, 2020 at 01:17:43PM -0600, Alex Williamson wrote: > > > > @@ -1346,15 +1526,32 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf) > > > > { > > > > struct vm_area_struct *vma = vmf->vma; > > > > struct vfio_pci_device *vdev = vma->vm_private_data; > > > > + vm_fault_t ret = VM_FAULT_NOPAGE; > > > > + > > > > + mutex_lock(&vdev->vma_lock); > > > > + down_read(&vdev->memory_lock); > > > > > > I remembered to have seen the fault() handling FAULT_FLAG_RETRY_NOWAIT at least > > > in the very first version, but it's not here any more... Could I ask what's > > > the reason behind? I probably have missed something along with the versions, > > > I'm just not sure whether e.g. this would potentially block a GUP caller even > > > if it's with FOLL_NOWAIT. > > > > This is largely what v2 was about, from the cover letter: > > > > Locking in 3/ is substantially changed to avoid the retry scenario > > within the fault handler, therefore a caller who does not allow > > retry will no longer receive a SIGBUS on contention. > > > > The discussion thread starts here: > > > > https://lore.kernel.org/kvm/20200501234849.GQ26002@xxxxxxxx/ > > [1] > > > > > Feel free to interject if there's something that doesn't make sense, > > the idea is that since we've fixed the lock ordering we never need to > > release one lock to wait for another, therefore we can wait for the > > lock. I'm under the impression that we can wait for the lock > > regardless of the flags under these conditions. > > I see; thanks for the link. Sorry I should probably follow up the discussion > and ask the question earlier, anyway... > > For what I understand now, IMHO we should still need all those handlings of > FAULT_FLAG_RETRY_NOWAIT like in the initial version. E.g., IIUC KVM gup will > try with FOLL_NOWAIT when async is allowed, before the complete slow path. I'm > not sure what would be the side effect of that if fault() blocked it. E.g., > the caller could be in an atomic context. > > But now I also agree that VM_FAULT_SIGBUS is probably not correct there in the > initial version [1] - I thought it was OK initially (after all after the > multiple fault retry series we should always be with FAULT_FLAG_ALLOW_RETRY..). > However after some thinking... it should be the common slow path where retry is > simply not allowed. So IMHO instead of SIGBUS there, we should also use all > the slow path of the locks. That'll be safe then because it's never going to > be with FAULT_FLAG_RETRY_NOWAIT (FAULT_FLAG_RETRY_NOWAIT depends on > FAULT_FLAG_ALLOW_RETRY). > > A reference code could be __lock_page_or_retry() where the lock_page could wait > just like we taking the sems/mutexes, and the previous SIGBUS case would > corresponds to this chunk of __lock_page_or_retry(): > > } else { > if (flags & FAULT_FLAG_KILLABLE) { > int ret; > > ret = __lock_page_killable(page); > if (ret) { > up_read(&mm->mmap_sem); > return 0; > } > } else > __lock_page(page); > return 1; > } > > Thanks, > > -- > Peter Xu -- Peter Xu