David Hildenbrand <david@xxxxxxxxxx> writes:

> On 27.03.24 09:21, Huang, Ying wrote:
>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>
>>> On 2024/3/27 10:04, Huang, Ying wrote:
>>>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>>>
>>>>> Now the anonymous page allocation already supports multi-size THP (mTHP),
>>>>> but the numa balancing still prohibits mTHP migration even though it is an
>>>>> exclusive mapping, which is unreasonable.
>>>>>
>>>>> Allow scanning mTHP:
>>>>> Commit 859d4adc3415 ("mm: numa: do not trap faults on shared data section
>>>>> pages") skips shared CoW pages' NUMA page migration to avoid shared data
>>>>> segment migration. In addition, commit 80d47f5de5e3 ("mm: don't try to
>>>>> NUMA-migrate COW pages that have other uses") change to use page_count()
>>>>> to avoid GUP pages migration, that will also skip the mTHP numa scaning.
>>>>> Theoretically, we can use folio_maybe_dma_pinned() to detect the GUP
>>>>> issue, although there is still a GUP race, the issue seems to have been
>>>>> resolved by commit 80d47f5de5e3. Meanwhile, use the folio_likely_mapped_shared()
>>>>> to skip shared CoW pages though this is not a precise sharers count. To
>>>>> check if the folio is shared, ideally we want to make sure every page is
>>>>> mapped to the same process, but doing that seems expensive and using
>>>>> the estimated mapcount seems can work when running autonuma benchmark.
>>>>
>>>> Because now we can deal with shared mTHP, it appears even possible to
>>>> remove folio_likely_mapped_shared() check?
>>>
>>> IMO, the issue solved by commit 859d4adc3415 is about shared CoW
>>> mapping, and I prefer to measure it in another patch:)
>>
>> I mean we can deal with shared mTHP (by multiple threads or multiple
>> processes) with this patch. Right?
>
> It's independent of the folio order. We don't want to mess with shared COW pages, see
>
> commit 859d4adc3415a64ccb8b0c50dc4e3a888dcb5805
> Author: Henry Willard <henry.willard@xxxxxxxxxx>
> Date:   Wed Jan 31 16:21:07 2018 -0800
>
>     mm: numa: do not trap faults on shared data section pages.
>
>     Workloads consisting of a large number of processes running the same
>     program with a very large shared data segment may experience performance
>     problems when numa balancing attempts to migrate the shared cow pages.
>     This manifests itself with many processes or tasks in
>     TASK_UNINTERRUPTIBLE state waiting for the shared pages to be migrated.
>     ...
>
> that introduced this handling.

Sorry, I misunderstood your words.

--
Best Regards,
Huang, Ying
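
For context, a minimal sketch (not the actual patch) of the kind of skip
test being discussed above; numa_migrate_skip_folio() is a hypothetical
name used only for illustration, while folio_maybe_dma_pinned() and
folio_likely_mapped_shared() are the existing kernel helpers mentioned in
the thread:

#include <linux/mm.h>

/* Hypothetical helper: should NUMA balancing leave this folio alone? */
static bool numa_migrate_skip_folio(struct folio *folio)
{
	/*
	 * Folios that look DMA-pinned (GUP users) should not be migrated,
	 * mirroring the intent of commit 80d47f5de5e3.
	 */
	if (folio_maybe_dma_pinned(folio))
		return true;

	/*
	 * Folios likely mapped by multiple processes (shared CoW data
	 * segments) are skipped, mirroring commit 859d4adc3415. Note this
	 * is an estimate, not a precise sharer count.
	 */
	return folio_likely_mapped_shared(folio);
}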