On Mon, Mar 16, 2020 at 10:02:50AM +0100, Christoph Hellwig wrote: > On Wed, Mar 11, 2020 at 03:35:00PM -0300, Jason Gunthorpe wrote: > > @@ -694,6 +672,15 @@ long hmm_range_fault(struct hmm_range *range, unsigned int flags) > > return -EBUSY; > > ret = walk_page_range(mm, hmm_vma_walk.last, range->end, > > &hmm_walk_ops, &hmm_vma_walk); > > + /* > > + * A pgmap is kept cached in the hmm_vma_walk to avoid expensive > > + * searching in the probably common case that the pgmap is the > > + * same for the entire requested range. > > + */ > > + if (hmm_vma_walk.pgmap) { > > + put_dev_pagemap(hmm_vma_walk.pgmap); > > + hmm_vma_walk.pgmap = NULL; > > + } > > } while (ret == -EBUSY); > > In which case it should only be put on return, and not for every loop. I chose this to be simple without having to goto unwind it. So, instead like this: @@ -683,21 +661,33 @@ long hmm_range_fault(struct hmm_range *range, unsigned int flags) .flags = flags, }; struct mm_struct *mm = range->notifier->mm; - int ret; + long ret; lockdep_assert_held(&mm->mmap_sem); do { /* If range is no longer valid force retry. */ if (mmu_interval_check_retry(range->notifier, - range->notifier_seq)) - return -EBUSY; + range->notifier_seq)) { + ret = -EBUSY; + goto out; + } ret = walk_page_range(mm, hmm_vma_walk.last, range->end, &hmm_walk_ops, &hmm_vma_walk); } while (ret == -EBUSY); if (ret) - return ret; - return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; + goto out; + ret = (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; + +out: + /* + * A pgmap is kept cached in the hmm_vma_walk to avoid expensive + * searching in the probably common case that the pgmap is the + * same for the entire requested range. + */ + if (hmm_vma_walk.pgmap) + put_dev_pagemap(hmm_vma_walk.pgmap); + return ret; } EXPORT_SYMBOL(hmm_range_fault); ? > I still think the right fix is to just delete all the unused and broken > pgmap handling code. If we ever need to add it back it can be added > in a proper understood and tested way. What I want to add is something like if (pgmap != walk->required_pgmap) cpu_flags = 0 hmm_range_need_fault(..., cpu_flags, ...) Which will fix a bug in nouveau where it blindly assumes any device pages are its own, IIRC. I think Ralph observed it needs to be here, because if the pgmap doesn't match then it should trigger migration, in a single call, rather than iterating. I'm mostly expecting to replace all the other pgmap code, but keep the pgmap caching. The existing pgmap stuff seems approx right, but useless.. Jason