Re: [PATCH 1/2] mm/numa: no task_numa_fault() call if page table is changed

On 8 Aug 2024, at 4:22, David Hildenbrand wrote:

> On 08.08.24 05:19, Baolin Wang wrote:
>>
>>
>> On 2024/8/8 02:47, Zi Yan wrote:
>>> When handling a numa page fault, task_numa_fault() should be called by a
>>> process that restores the page table of the faulted folio to avoid
>>> duplicated stats counting. Commit b99a342d4f11 ("NUMA balancing: reduce
>>> TLB flush via delaying mapping on hint page fault") restructured
>>> do_numa_page() and do_huge_pmd_numa_page() and did not avoid
>>> task_numa_fault() call in the second page table check after a numa
>>> migration failure. Fix it by making all !pte_same()/!pmd_same() return
>>> immediately.
>>>
>>> This issue can cause task_numa_fault() to be called more often than
>>> necessary and lead to unexpected numa balancing results (it is hard to
>>> tell whether the issue has a positive or negative performance impact,
>>> because the numa fault counting is duplicated).
>>>
>>> Reported-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>>> Closes: https://lore.kernel.org/linux-mm/87zfqfw0yw.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>> Fixes: b99a342d4f11 ("NUMA balancing: reduce TLB flush via delaying mapping on hint page fault")
>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
>>
>> The fix looks reasonable to me. Feel free to add:
>> Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>
>> (Nit: These goto labels are a bit confusing and might need some cleanup
>> in the future.)
>
> Agreed, maybe we should simply handle that right away and replace the "goto out;" users by "return 0;".
>
> Then, just copy the 3 LOC.
>
> For mm/memory.c that would be:
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 67496dc5064f..410ba50ca746 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5461,7 +5461,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>
>         if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
>                 pte_unmap_unlock(vmf->pte, vmf->ptl);
> -               goto out;
> +               return 0;
>         }
>
>         pte = pte_modify(old_pte, vma->vm_page_prot);
> @@ -5528,15 +5528,14 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>                 vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>                                                vmf->address, &vmf->ptl);
>                 if (unlikely(!vmf->pte))
> -                       goto out;
> +                       return 0;
>                 if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
>                         pte_unmap_unlock(vmf->pte, vmf->ptl);
> -                       goto out;
> +                       return 0;
>                 }
>                 goto out_map;
>         }
>
> -out:
>         if (nid != NUMA_NO_NODE)
>                 task_numa_fault(last_cpupid, nid, nr_pages, flags);
>         return 0;
> @@ -5552,7 +5551,9 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>                 numa_rebuild_single_mapping(vmf, vma, vmf->address, vmf->pte,
>                                             writable);
>         pte_unmap_unlock(vmf->pte, vmf->ptl);
> -       goto out;
> +       if (nid != NUMA_NO_NODE)
> +               task_numa_fault(last_cpupid, nid, nr_pages, flags);
> +       return 0;
>  }
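
For reference, the control flow that results from the diff above can be
sketched as a small user-space model. This is only an illustration under
stated assumptions, not kernel code: struct fault_model,
model_task_numa_fault(), model_do_numa_page() and model_self_check() are
hypothetical names, PTEs are reduced to plain integers, and locking,
flags and the real migration path are left out. The point it shows is
that every "page table changed" check returns without touching the
stats, while the two paths that complete the fault each count it once.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for the state do_numa_page() looks at. */
struct fault_model {
	unsigned long orig_pte;       /* PTE value recorded when the fault was taken */
	unsigned long cur_pte;        /* PTE value found at the first check */
	unsigned long pte_on_recheck; /* PTE value found after a failed migration */
	bool migrate_ok;              /* does the simulated migration succeed? */
	int folio_nid;                /* node the folio currently sits on */
	int target_nid;               /* migration target, or NUMA_NO_NODE */
	int stats_calls;              /* times the task_numa_fault() stand-in ran */
	bool mapping_restored;        /* did this fault restore the mapping? */
};

/* Stand-in for task_numa_fault(): only counts how often it is reached. */
static void model_task_numa_fault(struct fault_model *m)
{
	m->stats_calls++;
}

/* Simplified model of do_numa_page() with the "goto out" users removed. */
static int model_do_numa_page(struct fault_model *m)
{
	int nid;

	/* First check: the PTE changed under us, so do not count the fault. */
	if (m->cur_pte != m->orig_pte)
		return 0;

	nid = m->folio_nid;

	/* Folio is not misplaced: just restore the mapping. */
	if (m->target_nid == NUMA_NO_NODE)
		goto out_map;

	if (m->migrate_ok) {
		nid = m->target_nid;
	} else {
		/* Second check after a failed migration: same rule as above. */
		m->cur_pte = m->pte_on_recheck;
		if (m->cur_pte != m->orig_pte)
			return 0;
		goto out_map;
	}

	/* Migration succeeded: count the fault once. */
	if (nid != NUMA_NO_NODE)
		model_task_numa_fault(m);
	return 0;

out_map:
	/* This handler restores the mapping itself, so it counts the fault. */
	m->mapping_restored = true;
	if (nid != NUMA_NO_NODE)
		model_task_numa_fault(m);
	return 0;
}

/* Walks the four interesting cases and checks the stats call count. */
static void model_self_check(void)
{
	struct fault_model changed_early = { .orig_pte = 1, .cur_pte = 2,
		.migrate_ok = true, .folio_nid = 0, .target_nid = 1 };
	struct fault_model migrated = { .orig_pte = 1, .cur_pte = 1,
		.migrate_ok = true, .folio_nid = 0, .target_nid = 1 };
	struct fault_model failed_same = { .orig_pte = 1, .cur_pte = 1,
		.pte_on_recheck = 1, .folio_nid = 0, .target_nid = 1 };
	struct fault_model failed_changed = { .orig_pte = 1, .cur_pte = 1,
		.pte_on_recheck = 2, .folio_nid = 0, .target_nid = 1 };

	model_do_numa_page(&changed_early);
	assert(changed_early.stats_calls == 0);

	model_do_numa_page(&migrated);
	assert(migrated.stats_calls == 1 && !migrated.mapping_restored);

	model_do_numa_page(&failed_same);
	assert(failed_same.stats_calls == 1 && failed_same.mapping_restored);

	model_do_numa_page(&failed_changed);
	assert(failed_changed.stats_calls == 0);
}
```

Calling model_self_check() from a main() exercises all four cases. The
last one (migration fails and the PTE has changed by the time it is
rechecked) is the case the patch description is about: it returns
without reaching the stats call.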

Looks good to me. Thanks.

Hi Andrew,

Should I resend this to make backporting easier, or would you rather fold
David's changes in directly?

Best Regards,
Yan, Zi


