Re: [PATCH v3] mm: add per-order mTHP alloc_success and alloc_fail counters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 5, 2024 at 10:04 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 05.04.24 09:21, Ryan Roberts wrote:
> > On 05/04/2024 07:29, David Hildenbrand wrote:
> >> On 05.04.24 06:01, Barry Song wrote:
> >>> On Fri, Apr 5, 2024 at 3:57 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Apr 5, 2024 at 4:31 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> On 04.04.24 09:21, Ryan Roberts wrote:
> >>>>>> On 03/04/2024 22:00, Barry Song wrote:
> >>>>>>> On Thu, Apr 4, 2024 at 12:48 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> On 03/04/2024 09:22, David Hildenbrand wrote:
> >>>>>>>>> On 03.04.24 05:55, Barry Song wrote:
> >>>>>>>>>> From: Barry Song <v-songbaohua@xxxxxxxx>
> >>>>>>>>>>
> >>>>>>>>>> Profiling a system blindly with mTHP has become challenging due
> >>>>>>>>>> to the lack of visibility into its operations. Presenting the
> >>>>>>>>>> success rate of mTHP allocations appears to be pressing need.
> >>>>>>>>>>
> >>>>>>>>>> Recently, I've been experiencing significant difficulty debugging
> >>>>>>>>>> performance improvements and regressions without these figures.
> >>>>>>>>>> It's crucial for us to understand the true effectiveness of
> >>>>>>>>>> mTHP in real-world scenarios, especially in systems with
> >>>>>>>>>> fragmented memory.
> >>>>>>>>>>
> >>>>>>>>>> This patch sets up the framework for per-order mTHP counters,
> >>>>>>>>>> starting with the introduction of anon_alloc_success and
> >>>>>>>>>> anon_alloc_fail counters.  Incorporating additional counters
> >>>>>>>>>> should now be straightforward as well.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>>      -v3:
> >>>>>>>>>>      * save some memory as order-0 and order-1 can't be THP, Ryan;
> >>>>>>>>>>      * rename to anon_alloc as right now we only support anon to address
> >>>>>>>>>>        David's comment;
> >>>>>>>>>>      * drop a redundant "else", Ryan
> >>>>>>>>>>
> >>>>>>>>>>      include/linux/huge_mm.h | 18 ++++++++++++++
> >>>>>>>>>>      mm/huge_memory.c        | 54 +++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>>>      mm/memory.c             |  2 ++
> >>>>>>>>>>      3 files changed, 74 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>>>>>>>>> index e896ca4760f6..5e9af6be9537 100644
> >>>>>>>>>> --- a/include/linux/huge_mm.h
> >>>>>>>>>> +++ b/include/linux/huge_mm.h
> >>>>>>>>>> @@ -70,6 +70,7 @@ extern struct kobj_attribute shmem_enabled_attr;
> >>>>>>>>>>       * (which is a limitation of the THP implementation).
> >>>>>>>>>>       */
> >>>>>>>>>>      #define THP_ORDERS_ALL_ANON    ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0)
> >>>>>>>>>> | BIT(1)))
> >>>>>>>>>> +#define THP_MIN_ORDER        2
> >>>>>>>>>>        /*
> >>>>>>>>>>       * Mask of all large folio orders supported for file THP.
> >>>>>>>>>> @@ -264,6 +265,23 @@ unsigned long thp_vma_allowable_orders(struct
> >>>>>>>>>> vm_area_struct *vma,
> >>>>>>>>>>                            enforce_sysfs, orders);
> >>>>>>>>>>      }
> >>>>>>>>>>      +enum thp_event_item {
> >>>>>>>>>> +    THP_ANON_ALLOC_SUCCESS,
> >>>>>>>>>> +    THP_ANON_ALLOC_FAIL,
> >>>>>>>>>> +    NR_THP_EVENT_ITEMS
> >>>>>>>>>> +};
> >>>>>>>>>
> >>>>>>>>> Maybe use a prefix that resembles matches the enum name and is "obviously"
> >>>>>>>>> different to the ones in vm_event_item.h, like
> >>>>>>>>>
> >>>>>>>>> enum thp_event {
> >>>>>>>>>        THP_EVENT_ANON_ALLOC_SUCCESS,
> >>>>>>>>>        THP_EVENT_ANON_ALLOC_FAIL,
> >>>>>>>>>        __THP_EVENT_COUNT,
> >>>>>>>>> };
> >>>>>>>>
> >>>>>>>> FWIW, I'd personally replace "event" with "stat". For me "event" only ever
> >>>>>>>> increments, but "stat" can increment and decrement. An event is a type of
> >>>>>>>> stat.
> >>>>>>>>
> >>>>>>>> You are only adding events for now, but we have identified a need for
> >>>>>>>> inc/dec
> >>>>>>>> stats that will be added in future.
> >>>>>>>
> >>>>>>> What about the below?
> >>>>>>>
> >>>>>>> enum thp_stat {
> >>>>
> >>>> It seems we still need to use enum thp_stat_item rather than thp_stat.
> >>>> This follows
> >>>> enum zone_stat_item
> >>>> enum numa_stat_item
> >>>> enum node_stat_item
> >>
> >>
> >> Right, we can do that.
> >>
> >> [...]
> >>
> >>>> ok.
> >>>
> >>> Hi David, Ryan,
> >>>
> >>> I've named everything as follows. Please let me know if you have any further
> >>> suggestions before I send the updated version :-)
> >>>
> >>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>> index e896ca4760f6..cc13fa14aa32 100644
> >>> --- a/include/linux/huge_mm.h
> >>> +++ b/include/linux/huge_mm.h
> >>> @@ -264,6 +264,23 @@ unsigned long thp_vma_allowable_orders(struct
> >>> vm_area_struct *vma,
> >>>                                             enforce_sysfs, orders);
> >>>    }
> >>>
> >>> +enum thp_stat_item {
> >>> +       THP_STAT_ANON_ALLOC,
> >>> +       THP_STAT_ANON_ALLOC_FALLBACK,
> >>> +       __THP_STAT_COUNT
> >>> +};
> >>> +
> >>
> >>
> >> Final bikeshedding: so we want to call all of that MTHP_, mthp_ etc, to make it
> >> clearer that this is per mTHP size?
> >
> > Perviously we concluded that mTHP was only for commit logs, and in the longer
> > term we wanted THP to mean the full set of sizes like it does for hugetlb.
> > Neither "mTHP" nor "multi-size THP" currently exist in the code base.
>
> True, but some indication that these are "per THP size" might be reasonable.

It seems quite reasonable, especially considering the existence of THP counters.
Additionally, there's even an overlap between THP counters and mTHP counters,
specifically for PMD_ORDER. This complexity could potentially confuse
readers of the code.

I'm perfectly fine with using "M/m" as a prefix, as long as sysfs
hasn't exposed it.
Moreover, "M/m" remains a preferable prefix given that "PER_SIZE" is
overly long,
and people are already familiar with the concept of mTHP.

>
> --
> Cheers,
>
> David / dhildenb
>





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux