On 2019-07-01 2:20 a.m., Christoph Hellwig wrote: > We should not have two different error codes for the same condition. In > addition this really complicates the code due to the special handling of > EAGAIN that drops the mmap_sem due to the FAULT_FLAG_ALLOW_RETRY logic > in the core vm. I think the comment above hmm_range_snapshot needs an update. Also Documentation/vm/hmm.rst shows some example code using hmm_range_snapshot that retries on -EAGAIN. That would need to be updated to use -EBUSY or remove the retry logic altogether. Other than that, this patch is Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx> Philip, this means we should remove our retry logic again in amdgpu_ttm_tt_get_user_pages. According to the comment above hmm_range_fault, it can only return -EAGAIN if the block parameter is false. I think this statement is now actually true. We set block=true, so we can't get -EAGAIN. On -EBUSY we can let amdgpu_amdkfd_restore_userptr_worker schedule the retry (which it does already anyway). Regards, Felix > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > --- > mm/hmm.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/mm/hmm.c b/mm/hmm.c > index c85ed7d4e2ce..d125df698e2b 100644 > --- a/mm/hmm.c > +++ b/mm/hmm.c > @@ -974,7 +974,7 @@ long hmm_range_snapshot(struct hmm_range *range) > do { > /* If range is no longer valid force retry. */ > if (!range->valid) > - return -EAGAIN; > + return -EBUSY; > > vma = find_vma(hmm->mm, start); > if (vma == NULL || (vma->vm_flags & device_vma)) > @@ -1069,10 +1069,8 @@ long hmm_range_fault(struct hmm_range *range, bool block) > > do { > /* If range is no longer valid force retry. */ > - if (!range->valid) { > - up_read(&hmm->mm->mmap_sem); > - return -EAGAIN; > - } > + if (!range->valid) > + return -EBUSY; > > vma = find_vma(hmm->mm, start); > if (vma == NULL || (vma->vm_flags & device_vma))