On Thu, Feb 27, 2025 at 2:34 AM Baolin Wang
<baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2025/2/27 00:23, Brian Geffon wrote:
> > When handling faults for anon shmem, finish_fault() will attempt to
> > install ptes for the entire folio. Unfortunately, if it encounters a
> > single non-pte_none entry in that range it will bail, even if the pte
> > that triggered the fault is still pte_none. When this happens, the
> > fault is retried endlessly, never making forward progress.
> >
> > This patch fixes that behavior: if a pte in the range is found to be
> > not pte_none, it falls back to setting a single pte.
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > Fixes: 43e027e41423 ("mm: memory: extend finish_fault() to support large folio")
> > Suggested-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> > Reported-by: Marek Maslanka <mmaslanka@xxxxxxxxxx>
> > Signed-off-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> > ---
> >  mm/memory.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index b4d3d4893267..b6c467fdbfa4 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5183,7 +5183,11 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
> >  	bool is_cow = (vmf->flags & FAULT_FLAG_WRITE) &&
> >  		      !(vma->vm_flags & VM_SHARED);
> >  	int type, nr_pages;
> > -	unsigned long addr = vmf->address;
> > +	unsigned long addr;
> > +	bool needs_fallback = false;
> > +
> > +fallback:
> > +	addr = vmf->address;
> >
> >  	/* Did we COW the page? */
> >  	if (is_cow)
> > @@ -5222,7 +5226,8 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
> >  	 * approach also applies to non-anonymous-shmem faults to avoid
> >  	 * inflating the RSS of the process.
> >  	 */
> > -	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma))) {
> > +	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
> > +	    unlikely(needs_fallback)) {
>
> Nit: can you align the code? Otherwise looks good to me.

I mailed a v3 with adjusted alignment, I'll let Andrew decide which
variation he prefers.

> >  		nr_pages = 1;
> >  	} else if (nr_pages > 1) {
> >  		pgoff_t idx = folio_page_idx(folio, page);
> > @@ -5258,9 +5263,9 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
> >  		ret = VM_FAULT_NOPAGE;
> >  		goto unlock;
> >  	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> > -		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
> > -		ret = VM_FAULT_NOPAGE;
> > -		goto unlock;
> > +		needs_fallback = true;
> > +		pte_unmap_unlock(vmf->pte, vmf->ptl);
> > +		goto fallback;
> >  	}
> >
> >  	folio_ref_add(folio, nr_pages - 1);

Thanks for looking.

Brian
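
[For context, a condensed sketch of the control flow the patch introduces in
finish_fault(), pieced together from the diff above. This is not the verbatim
kernel source; intervening code, locking setup, and error paths are elided.]

	bool needs_fallback = false;
	unsigned long addr;

fallback:
	addr = vmf->address;

	/* ... */

	/* Fall back to installing a single pte when a previous pass found
	 * the range partially populated (or for the pre-existing
	 * non-anon-shmem / userfaultfd single-page cases). */
	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
	    unlikely(needs_fallback))
		nr_pages = 1;

	/* ... */

	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
		/* Some pte in the folio's range is already populated. The
		 * old code returned VM_FAULT_NOPAGE here, which retried the
		 * fault forever when the faulting pte itself was still
		 * pte_none. Instead, drop the pte lock and retry from the
		 * fallback label, mapping only the faulting page. */
		needs_fallback = true;
		pte_unmap_unlock(vmf->pte, vmf->ptl);
		goto fallback;
	}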