[...]
There was the case for "FOLL_PIN represents application behavior and
should trigger NUMA faults", but I guess that can be ignored.
But it would be much better to just remove all that if we can.
Let me look into some details.
I just stumbled over the comment from Mel in follow_trans_huge_pmd():

	/* Full NUMA hinting faults to serialise migration in fault paths */

It dates back to

commit 2b4847e73004c10ae6666c2e27b5c5430aed8698
Author: Mel Gorman <mgorman@xxxxxxx>
Date:   Wed Dec 18 17:08:32 2013 -0800

    mm: numa: serialise parallel get_user_page against THP migration

    Base pages are unmapped and flushed from cache and TLB during normal
    page migration and replaced with a migration entry that causes any
    parallel NUMA hinting fault or gup to block until migration completes.

    THP does not unmap pages due to a lack of support for migration entries
    at a PMD level. This allows races with get_user_pages and
    get_user_pages_fast which commit 3f926ab945b6 ("mm: Close races between
    THP migration and PMD numa clearing") made worse by introducing a
    pmd_clear_flush().

    This patch forces get_user_page (fast and normal) on a pmd_numa page to
    go through the slow get_user_page path where it will serialise against
    THP migration and properly account for the NUMA hinting fault. On the
    migration side the page table lock is taken for each PTE update.

Nowadays we do have migration entries at the PMD level, and if the race
still existed, setting FOLL_FORCE (which skips the NUMA hinting fault)
could trigger it in the same way.
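
To make the FOLL_FORCE point concrete, here is a tiny userspace model of the
decision as I understand it. This is only a sketch, not kernel code: every
model_*/MODEL_* name and the flag values are made up for illustration, and
the only behaviour it assumes is that GUP requests a NUMA hinting fault
unless FOLL_FORCE is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's values. */
#define MODEL_FOLL_FORCE 0x1u
#define MODEL_FOLL_NUMA  0x2u

/* Outcome of looking up a PMD in the model. */
enum model_result {
	MODEL_PAGE,		/* page is returned directly */
	MODEL_NUMA_FAULT,	/* caller must take a full NUMA hinting fault */
};

/* A PMD reduced to the single property this discussion cares about. */
struct model_pmd {
	bool numa_hinting;	/* entry is marked for NUMA hinting */
};

/* Assumption: NUMA hinting faults are requested unless FOLL_FORCE is set. */
static unsigned int model_gup_flags(unsigned int flags)
{
	if (!(flags & MODEL_FOLL_FORCE))
		flags |= MODEL_FOLL_NUMA;
	return flags;
}

/* Decide whether the lookup returns the page or demands a hinting fault. */
static enum model_result model_follow_pmd(const struct model_pmd *pmd,
					  unsigned int flags)
{
	assert(pmd != NULL);

	flags = model_gup_flags(flags);
	if ((flags & MODEL_FOLL_NUMA) && pmd->numa_hinting)
		return MODEL_NUMA_FAULT;
	return MODEL_PAGE;
}
```

In this model a FOLL_FORCE lookup on a hinting PMD gets the page without ever
taking the fault, so it never goes through the serialisation that Mel's
comment relies on.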
So I suspect we're good.
--
Cheers,
David / dhildenb