On Wed, Mar 11, 2020 at 06:36:47PM -0700, Ralph Campbell wrote: > > @@ -390,8 +384,15 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, > > return -EBUSY; > > } > > return hmm_pfns_fill(start, end, range, HMM_PFN_NONE); > > - } else if (!pmd_present(pmd)) > > + } > > + > > + if (!pmd_present(pmd)) { > > + hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0, &fault, > > + &write_fault); > > + if (fault || write_fault) > > + return -EFAULT; > > return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR); > > Shouldn't this fill with HMM_PFN_NONE instead of HMM_PFN_ERROR? > Otherwise, when a THP is swapped out, you will get a different > value than if a PTE is swapped out and you are prefetching/snapshotting. If this is the case then the problem is that the return -EFAULT path needs to do something else.. ie since the above code can't trigger swap in, it is correct to return PFN_ERROR. I'm completely guessing, but do we need to call pmd_to_swp_entry() and handle things similarly to the pte? What swp_entries are valid for a pmd? Do you understand this better, or know how to trigger a !pmd_present for test? I suppose another option would be this: if (!pmd_present(pmd)) { hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0, &fault, &write_fault); /* We can't handle this. Cause the PMD to be split and * handle it in the pte handler. */ if (fault || write_fault) return 0; return hmm_pfns_fill(start, end, range, HMM_PFN_NONE); } Which, I think, must be correct, but inefficient? Jason