On Tue, Aug 01, 2023 at 02:48:40PM +0200, David Hildenbrand wrote:
> Commit 0b9d705297b2 ("mm: numa: Support NUMA hinting page faults from
> gup/gup_fast") from 2012 documented as the primary reason why we would want
> to handle NUMA hinting faults from GUP:
>
>     KVM secondary MMU page faults will trigger the NUMA hinting page
>     faults through gup_fast -> get_user_pages -> follow_page ->
>     handle_mm_fault.
>
> That is still the case today, and relevant KVM code has been converted to
> manually set FOLL_HONOR_NUMA_FAULT. So let's stop setting
> FOLL_HONOR_NUMA_FAULT for all GUP users and cross fingers that not that
> many other ones that really require such handling for autonuma remain.
>
> Possible interaction with MMU notifiers:
>
> Assume a driver obtains a page using get_user_pages() to map it into
> a secondary MMU, and uses the MMU notifier framework to get notified on
> changes.
>
> Assume get_user_pages() succeeded on a PROT_NONE-mapped page (because
> FOLL_HONOR_NUMA_FAULT is not set) in an accessible VMA and the page is
> mapped into a secondary MMU. Once user space would turn that mapping
> inaccessible using mprotect(PROT_NONE), the actual PTE in the page table
> might not change. If the MMU notifier would be smart and optimize for that
> case "why notify if the PTE didn't change", that could be problematic.
>
> At least change_pmd_range() with MMU_NOTIFY_PROTECTION_VMA for now does an
> unconditional mmu_notifier_invalidate_range_start() ->
> mmu_notifier_invalidate_range_end() and should be fine.
>
> Note that even if a PTE in an accessible VMA is pte_protnone(), the
> underlying page might be accessed by a secondary MMU that does not set
> FOLL_HONOR_NUMA_FAULT, and test_young() MMU notifiers would return "true".
>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Also seems sane but a large portion of its correctness also depends on
patch 3 being correct.

--
Mel Gorman
SUSE Labs