David Hildenbrand <david@xxxxxxxxxx> writes:

> On 27.03.24 09:21, Huang, Ying wrote:
>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>
>>> On 2024/3/27 10:04, Huang, Ying wrote:
>>>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>>>
>>>>> Now the anonymous page allocation already supports multi-size THP (mTHP),
>>>>> but the numa balancing still prohibits mTHP migration even though it is an
>>>>> exclusive mapping, which is unreasonable.
>>>>>
>>>>> Allow scanning mTHP:
>>>>> Commit 859d4adc3415 ("mm: numa: do not trap faults on shared data section
>>>>> pages") skips shared CoW pages' NUMA page migration to avoid shared data
>>>>> segment migration. In addition, commit 80d47f5de5e3 ("mm: don't try to
>>>>> NUMA-migrate COW pages that have other uses") change to use page_count()
>>>>> to avoid GUP pages migration, that will also skip the mTHP numa scaning.
>>>>> Theoretically, we can use folio_maybe_dma_pinned() to detect the GUP
>>>>> issue, although there is still a GUP race, the issue seems to have been
>>>>> resolved by commit 80d47f5de5e3. Meanwhile, use the folio_likely_mapped_shared()
>>>>> to skip shared CoW pages though this is not a precise sharers count. To
>>>>> check if the folio is shared, ideally we want to make sure every page is
>>>>> mapped to the same process, but doing that seems expensive and using
>>>>> the estimated mapcount seems can work when running autonuma benchmark.
>>>>
>>>> Because now we can deal with shared mTHP, it appears even possible to
>>>> remove folio_likely_mapped_shared() check?
>>>
>>> IMO, the issue solved by commit 859d4adc3415 is about shared CoW
>>> mapping, and I prefer to measure it in another patch:)
>>
>> I mean we can deal with shared mTHP (by multiple threads or multiple
>> processes) with this patch. Right?
>
> It's independent of the folio order. We don't want to mess with shared COW pages, see
>
> commit 859d4adc3415a64ccb8b0c50dc4e3a888dcb5805
> Author: Henry Willard <henry.willard@xxxxxxxxxx>
> Date:   Wed Jan 31 16:21:07 2018 -0800
>
>     mm: numa: do not trap faults on shared data section pages.
>
>     Workloads consisting of a large number of processes running the same
>     program with a very large shared data segment may experience performance
>     problems when numa balancing attempts to migrate the shared cow pages.
>     This manifests itself with many processes or tasks in
>     TASK_UNINTERRUPTIBLE state waiting for the shared pages to be migrated.
>     ...
>
> that introduced this handling.

Sorry, I misunderstood your words.

--
Best Regards,
Huang, Ying
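
For context, a minimal sketch (not the actual patch) of the kind of skip
test being discussed above; numa_migrate_skip_folio() is a hypothetical
name used only for illustration, while folio_maybe_dma_pinned() and
folio_likely_mapped_shared() are the existing kernel helpers mentioned in
the thread:

#include <linux/mm.h>

/* Hypothetical helper: should NUMA balancing leave this folio alone? */
static bool numa_migrate_skip_folio(struct folio *folio)
{
	/*
	 * Folios that look DMA-pinned (GUP users) should not be migrated,
	 * mirroring the intent of commit 80d47f5de5e3.
	 */
	if (folio_maybe_dma_pinned(folio))
		return true;

	/*
	 * Folios likely mapped by multiple processes (shared CoW data
	 * segments) are skipped, mirroring commit 859d4adc3415. Note this
	 * is an estimate, not a precise sharer count.
	 */
	return folio_likely_mapped_shared(folio);
}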