Re: [PATCH v3] mm: add per-order mTHP alloc_success and alloc_fail counters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05.04.24 11:24, Barry Song wrote:
On Fri, Apr 5, 2024 at 10:04 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 05.04.24 09:21, Ryan Roberts wrote:
On 05/04/2024 07:29, David Hildenbrand wrote:
On 05.04.24 06:01, Barry Song wrote:
On Fri, Apr 5, 2024 at 3:57 PM Barry Song <21cnbao@xxxxxxxxx> wrote:

On Fri, Apr 5, 2024 at 4:31 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 04.04.24 09:21, Ryan Roberts wrote:
On 03/04/2024 22:00, Barry Song wrote:
On Thu, Apr 4, 2024 at 12:48 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

On 03/04/2024 09:22, David Hildenbrand wrote:
On 03.04.24 05:55, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

Profiling a system blindly with mTHP has become challenging due
to the lack of visibility into its operations. Presenting the
success rate of mTHP allocations appears to be pressing need.

Recently, I've been experiencing significant difficulty debugging
performance improvements and regressions without these figures.
It's crucial for us to understand the true effectiveness of
mTHP in real-world scenarios, especially in systems with
fragmented memory.

This patch sets up the framework for per-order mTHP counters,
starting with the introduction of anon_alloc_success and
anon_alloc_fail counters.  Incorporating additional counters
should now be straightforward as well.

Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
---
      -v3:
      * save some memory as order-0 and order-1 can't be THP, Ryan;
      * rename to anon_alloc as right now we only support anon to address
        David's comment;
      * drop a redundant "else", Ryan

      include/linux/huge_mm.h | 18 ++++++++++++++
      mm/huge_memory.c        | 54 +++++++++++++++++++++++++++++++++++++++++
      mm/memory.c             |  2 ++
      3 files changed, 74 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e896ca4760f6..5e9af6be9537 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -70,6 +70,7 @@ extern struct kobj_attribute shmem_enabled_attr;
       * (which is a limitation of the THP implementation).
       */
      #define THP_ORDERS_ALL_ANON    ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0)
| BIT(1)))
+#define THP_MIN_ORDER        2
        /*
       * Mask of all large folio orders supported for file THP.
@@ -264,6 +265,23 @@ unsigned long thp_vma_allowable_orders(struct
vm_area_struct *vma,
                            enforce_sysfs, orders);
      }
      +enum thp_event_item {
+    THP_ANON_ALLOC_SUCCESS,
+    THP_ANON_ALLOC_FAIL,
+    NR_THP_EVENT_ITEMS
+};

Maybe use a prefix that resembles matches the enum name and is "obviously"
different to the ones in vm_event_item.h, like

enum thp_event {
        THP_EVENT_ANON_ALLOC_SUCCESS,
        THP_EVENT_ANON_ALLOC_FAIL,
        __THP_EVENT_COUNT,
};

FWIW, I'd personally replace "event" with "stat". For me "event" only ever
increments, but "stat" can increment and decrement. An event is a type of
stat.

You are only adding events for now, but we have identified a need for
inc/dec
stats that will be added in future.

What about the below?

enum thp_stat {

It seems we still need to use enum thp_stat_item rather than thp_stat.
This follows
enum zone_stat_item
enum numa_stat_item
enum node_stat_item


Right, we can do that.

[...]

ok.

Hi David, Ryan,

I've named everything as follows. Please let me know if you have any further
suggestions before I send the updated version :-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e896ca4760f6..cc13fa14aa32 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -264,6 +264,23 @@ unsigned long thp_vma_allowable_orders(struct
vm_area_struct *vma,
                                             enforce_sysfs, orders);
    }

+enum thp_stat_item {
+       THP_STAT_ANON_ALLOC,
+       THP_STAT_ANON_ALLOC_FALLBACK,
+       __THP_STAT_COUNT
+};
+


Final bikeshedding: so we want to call all of that MTHP_, mthp_ etc, to make it
clearer that this is per mTHP size?

Perviously we concluded that mTHP was only for commit logs, and in the longer
term we wanted THP to mean the full set of sizes like it does for hugetlb.
Neither "mTHP" nor "multi-size THP" currently exist in the code base.

True, but some indication that these are "per THP size" might be reasonable.

It seems quite reasonable, especially considering the existence of THP counters.
Additionally, there's even an overlap between THP counters and mTHP counters,
specifically for PMD_ORDER. This complexity could potentially confuse
readers of the code.

I'm perfectly fine with using "M/m" as a prefix, as long as sysfs
hasn't exposed it.
Moreover, "M/m" remains a preferable prefix given that "PER_SIZE" is
overly long,
and people are already familiar with the concept of mTHP.

I think the important part is to not use "mTHP" in any of our ABI. As we expose these stats per THP size, using mTHP internally should be ok. But I'm happy to hear alternatives that express "per THP size" in a better way.

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux