Re: [PATCH v3 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 31 Oct 2024 15:03:34 -0400 Joshua Hahn <joshua.hahnjy@xxxxxxxxx> wrote:

> Andrew -- I am sorry to ask again, but do you think you can replace
> the 3rd section in the patch (3. Implementation Details) with the
> following paragraphs?

No problem.

: This patch introduces a new counter to memory.stat that tracks hugeTLB
: usage, only if hugeTLB accounting is done to memory.current.  This feature
: is enabled the same way hugeTLB accounting is enabled, via the
: memory_hugetlb_accounting mount flag for cgroupsv2.
: 
: 1. Why is this patch necessary?
: Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
: hugeTLB usage to memory.current.  However, the metric is not reported in
: memory.stat.  Given that users often interpret memory.stat as a breakdown
: of the value reported in memory.current, the disparity between the two
: reports can be confusing.  This patch solves this problem by including the
: metric in memory.stat as well, but only if it is also reported in
: memory.current (it would also be confusing if the value was reported in
: memory.stat, but not in memory.current)
: 
: Aside from the consistency between the two files, we also see benefits in
: observability.  Userspace might be interested in the hugeTLB footprint of
: cgroups for many reasons.  For instance, system admins might want to
: verify that hugeTLB usage is distributed as expected across tasks: i.e. 
: memory-intensive tasks are using more hugeTLB pages than tasks that don't
: consume a lot of memory, or are seen to fault frequently.  Note that this
: is separate from wanting to inspect the distribution for limiting purposes
: (in which case, hugeTLB controller makes more sense).
: 
: 2.  We already have a hugeTLB controller.  Why not use that?  It is true
: that hugeTLB tracks the exact value that we want.  In fact, by enabling
: the hugeTLB controller, we get all of the observability benefits that I
: mentioned above, and users can check the total hugeTLB usage, verify if it
: is distributed as expected, etc.
: 
: 3.  Implementation Details:
: In the alloc / free hugetlb functions, we call lruvec_stat_mod_folio
: regardless of whether memcg accounts hugetlb.  mem_cgroup_commit_charge
: which is called from alloc_hugetlb_folio will set memcg for the folio
: only if the CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING cgroup mount option is
: used, so lruvec_stat_mod_folio accounts per-memcg hugetlb counters only
: if the feature is enabled.  Regardless of whether memcg accounts for
: hugetlb, the newly added global counter is updated and shown in
: /proc/vmstat.
: 
: The global counter is added because vmstats is the preferred framework
: for cgroup stats.  It makes stat items consistent between global and
: cgroups.  It also provides a per-node breakdown, which is useful. 
: Because it does not use cgroup-specific hooks, we also keep generic MM
: code separate from memcg code.
: 
: With this said, there are 2 problems:
: (a) They are still not reported in memory.stat, which means the
:     disparity between the memcg reports are still there.
: (b) We cannot reasonably expect users to enable the hugeTLB controller
:     just for the sake of hugeTLB usage reporting, especially since
:     they don't have any use for hugeTLB usage enforcing [2].
: 
: [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@xxxxxxxxx/
: [2] Of course, we can't make a new patch for every feature that can be
:     duplicated. However, since the existing solution of enabling the
:     hugeTLB controller is an imperfect solution that still leaves a
:     discrepancy between memory.stat and memory.curent, I think that it
:     is reasonable to isolate the feature in this case.





[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux