On 8 Aug 2024, at 4:22, David Hildenbrand wrote:

> On 08.08.24 05:19, Baolin Wang wrote:
>>
>>
>> On 2024/8/8 02:47, Zi Yan wrote:
>>> When handling a numa page fault, task_numa_fault() should be called by a
>>> process that restores the page table of the faulted folio to avoid
>>> duplicated stats counting. Commit b99a342d4f11 ("NUMA balancing: reduce
>>> TLB flush via delaying mapping on hint page fault") restructured
>>> do_numa_page() and do_huge_pmd_numa_page() and did not avoid
>>> task_numa_fault() call in the second page table check after a numa
>>> migration failure. Fix it by making all !pte_same()/!pmd_same() return
>>> immediately.
>>>
>>> This issue can cause task_numa_fault() being called more than necessary
>>> and lead to unexpected numa balancing results (It is hard to tell whether
>>> the issue will cause positive or negative performance impact due to
>>> duplicated numa fault counting).
>>>
>>> Reported-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>>> Closes: https://lore.kernel.org/linux-mm/87zfqfw0yw.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>> Fixes: b99a342d4f11 ("NUMA balancing: reduce TLB flush via delaying mapping on hint page fault")
>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
>>
>> The fix looks reasonable to me. Feel free to add:
>> Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>
>> (Nit: These goto labels are a bit confusing and might need some cleanup
>> in the future.)
>
> Agreed, maybe we should simply handle that right away and replace the "goto out;" users by "return 0;".
>
> Then, just copy the 3 LOC.
>
> For mm/memory.c that would be:
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 67496dc5064f..410ba50ca746 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5461,7 +5461,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>          if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
>                  pte_unmap_unlock(vmf->pte, vmf->ptl);
> -                goto out;
> +                return 0;
>          }
>          pte = pte_modify(old_pte, vma->vm_page_prot);
> @@ -5528,15 +5528,14 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>                  vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>                                                 vmf->address, &vmf->ptl);
>                  if (unlikely(!vmf->pte))
> -                        goto out;
> +                        return 0;
>                  if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
>                          pte_unmap_unlock(vmf->pte, vmf->ptl);
> -                        goto out;
> +                        return 0;
>                  }
>                  goto out_map;
>          }
> -out:
>          if (nid != NUMA_NO_NODE)
>                  task_numa_fault(last_cpupid, nid, nr_pages, flags);
>          return 0;
> @@ -5552,7 +5551,9 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>                  numa_rebuild_single_mapping(vmf, vma, vmf->address, vmf->pte,
>                                              writable);
>          pte_unmap_unlock(vmf->pte, vmf->ptl);
> -        goto out;
> +        if (nid != NUMA_NO_NODE)
> +                task_numa_fault(last_cpupid, nid, nr_pages, flags);
> +        return 0;
>  }

Looks good to me. Thanks.

Hi Andrew,

Should I resend this for an easy back porting? Or you want to fold David’s changes in directly?

Best Regards,
Yan, Zi