Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> writes:

> On 2023/10/10 20:33, Matthew Wilcox wrote:
>> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
>>> them don't support numa balancing, and the page struct is aligned
>>> to _struct_page_alignment, it is safe to move _last_cpupid before
>>> 'virtual' in page, meanwhile, add it into folio, which make us to
>>> use folio->_last_cpupid directly.
>> What do you mean by "safe"?  I think you mean "Does not increase the
>> size of struct page", but if that is what you mean, why not just say so?
>> If there's something else you mean, please explain.
>
> Don't increase size of struct page and don't impact the real order of
> struct page as the above three archs without numa balancing support.
>
>> In any event, I'd like to see some reasoning that _last_cpupid is
>> actually information which is logically maintained on a
>> per-allocation basis, not a per-page basis (I think this is true, but
>> I honestly don't know)
>
> The _last_cpupid is updated in should_numa_migrate_memory() from numa
> fault(do_numa_page, and do_huge_pmd_numa_page), it is per-page(normal
> page and PMD-mapped page). Maybe I misunderstand your mean, please
> correct me.

Because a PTE-mapped THP will not be migrated, according to the comments
and the folio_test_large() test in do_numa_page(), only the _last_cpupid
of the head page will be used (that is, on a per-allocation basis).
Although the _last_cpupid of tail pages may be changed in
change_pte_range() in mprotect.c, those values are not actually used.
All in all, _last_cpupid is on a per-allocation basis for now.

In the future, it's hard to say.  PTE-mapped THPs or large folios give
us an opportunity to check whether different parts of a folio are
accessed by multiple sockets, in which case we should split the folio.
But this is just a possibility for the future.

--
Best Regards,
Huang, Ying