On 02/04/2024 22:29, Barry Song wrote:
> On Wed, Apr 3, 2024 at 7:46 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>> On 28.03.24 10:51, Barry Song wrote:
>>> From: Barry Song <v-songbaohua@xxxxxxxx>
>>>
>>> Profiling a system blindly with mTHP has become challenging due
>>> to the lack of visibility into its operations. Presenting the
>>> success rate of mTHP allocations appears to be pressing need.
>>>
>>> Recently, I've been experiencing significant difficulty debugging
>>> performance improvements and regressions without these figures.
>>> It's crucial for us to understand the true effectiveness of
>>> mTHP in real-world scenarios, especially in systems with
>>> fragmented memory.
>>>
>>> This patch sets up the framework for per-order mTHP counters,
>>> starting with the introduction of alloc_success and alloc_fail
>>> counters. Incorporating additional counters should now be
>>> straightforward as well.
>>>
>>> The initial two unsigned longs for each event are unused, given
>>> that order-0 and order-1 are not mTHP. Nonetheless, this refinement
>>> improves code clarity.
>>>
>>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
>>> ---
>>> -v2:
>>> * move to sysfs and provide per-order counters; David, Ryan, Willy
>>> -v1:
>>> https://lore.kernel.org/linux-mm/20240326030103.50678-1-21cnbao@xxxxxxxxx/
>>>
>>> include/linux/huge_mm.h | 17 +++++++++++++
>>> mm/huge_memory.c | 54 +++++++++++++++++++++++++++++++++++++++++
>>> mm/memory.c | 3 +++
>>> 3 files changed, 74 insertions(+)
>>>
>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>> index e896ca4760f6..27fa26a22a8f 100644
>>> --- a/include/linux/huge_mm.h
>>> +++ b/include/linux/huge_mm.h
>>> @@ -264,6 +264,23 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>> enforce_sysfs, orders);
>>> }
>>>
>>> +enum thp_event_item {
>>> + THP_ALLOC_SUCCESS,
>>> + THP_ALLOC_FAIL,
>>> + NR_THP_EVENT_ITEMS
>>> +};
>>
>> I'm wondering if these should be ANON specific for now. We might want to
>> add others (shmem, file) in the future.
>
> I've two ways to do that
> 1. rename to ANON_THP_ALLOC, so that I can have SHMEM_THP_ALLOC, FILE_THP_ALLOC
> in the future;
> 2. let THP_ALLOC cover all of shmem, file and anon.
>
> following vmstat, actually 1 might be better as we have both THP_FAULT_ALLOC and
> THP_FILE_ALLOC for pmd-mapped THP.
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> THP_FAULT_ALLOC,
> THP_FAULT_FALLBACK,
> THP_FAULT_FALLBACK_CHARGE,
> THP_COLLAPSE_ALLOC,
> THP_COLLAPSE_ALLOC_FAILED,
> THP_FILE_ALLOC,
> THP_FILE_FALLBACK,
> THP_FILE_FALLBACK_CHARGE,
> THP_FILE_MAPPED,
> THP_SPLIT_PAGE,
> THP_SPLIT_PAGE_FAILED,
> THP_DEFERRED_SPLIT_PAGE,
> THP_SPLIT_PMD,
> THP_SCAN_EXCEED_NONE_PTE,
> THP_SCAN_EXCEED_SWAP_PTE,
> THP_SCAN_EXCEED_SHARED_PTE,
> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> THP_SPLIT_PUD,
> #endif
> THP_ZERO_PAGE_ALLOC,
> THP_ZERO_PAGE_ALLOC_FAILED,
> THP_SWPOUT,
> THP_SWPOUT_FALLBACK,
> #endif
>
> And reading mm/shmem.c, obviously, shmem is using THP_FILE_ALLOC.
>
> I will rename it to ANON_THP_ALLOC in v3, let me know if you disagree :-)

I don't think the name of the enum is important - it's an implementation detail
that can be changed. It's the name of the sysfs file that matters. Although of
course it's nice to keep them in sync from a maintenance pov.

Currently they are called "alloc_success" and "alloc_fail" I believe? Perhaps
"anon_alloc" and "anon_alloc_fallback" are a bit more in keeping with vmstat?

I'm assuming that:

  vmstat:thp_fault_alloc == hugepages-2048kB/stats/anon_alloc
  vmstat:thp_fault_alloc_fallback == hugepages-2048kB/stats/anon_alloc_fallback

?

Thanks,
Ryan

>
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
>
> Thanks
> Barry
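
For readers following the naming discussion, here is a minimal userspace sketch of what per-order counters with the suggested "anon_" names might look like. It is not the code from the patch: the enum and function names, the plain (non-per-CPU) array, and the 4K base page / order-9 PMD assumption are all hypothetical and only meant to illustrate the layout described in the commit message, where the slots for order-0 and order-1 exist but stay unused.

```c
#include <assert.h>

/* Hypothetical item names, following the "anon_alloc" and
 * "anon_alloc_fallback" sysfs names suggested in this thread. */
enum mthp_stat_item {
	MTHP_STAT_ANON_ALLOC,
	MTHP_STAT_ANON_ALLOC_FALLBACK,
	NR_MTHP_STAT_ITEMS
};

/* Assumption: 4K base pages and a 2M PMD, so the largest order is 9. */
#define SKETCH_PMD_ORDER 9

/* One slot per order for each item. Slots 0 and 1 are never written,
 * because order-0 and order-1 are not mTHP. */
static unsigned long mthp_stats[NR_MTHP_STAT_ITEMS][SKETCH_PMD_ORDER + 1];

/* Bump one counter for one order (no locking or per-CPU handling here). */
static void count_mthp_stat(int order, enum mthp_stat_item item)
{
	assert(order >= 2 && order <= SKETCH_PMD_ORDER);
	mthp_stats[item][order]++;
}

/* Size in kB for the hugepages-<size>kB directory name, assuming 4K pages. */
static unsigned long order_to_kb(int order)
{
	return 4UL << order;
}

/* Fallback percentage for one order; 0 when nothing was attempted. */
static unsigned long fallback_percent(int order)
{
	unsigned long ok = mthp_stats[MTHP_STAT_ANON_ALLOC][order];
	unsigned long fb = mthp_stats[MTHP_STAT_ANON_ALLOC_FALLBACK][order];

	return (ok + fb) ? (100 * fb) / (ok + fb) : 0;
}
```

With a layout like this, the order-9 row would be the one expected to line up with the existing PMD-sized vmstat counters, which is the equivalence assumed above for hugepages-2048kB.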