Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller

On Mon, Oct 02, 2023 at 05:18:27PM -0700, Nhat Pham wrote:
> Currently, hugetlb memory usage is not accounted for in the memory
> controller, which could lead to memory overprotection for cgroups with
> hugetlb-backed memory. This has been observed in our production system.
> 
> For instance, here is one of our use cases: suppose there are two 32G
> containers. The machine is booted with hugetlb_cma=6G, and each
> container may or may not use up to 3 gigantic pages, depending on the
> workload within it. The rest is anon, cache, slab, etc. We can set the
> hugetlb cgroup limit of each cgroup to 3G to enforce hugetlb fairness.
> But it is very difficult to configure memory.max to keep overall
> consumption, including anon, cache, slab, etc., fair.
> 
> What we have had to resort to is to constantly poll hugetlb usage and
> readjust memory.max. A similar procedure is applied to other memory
> limits (e.g. memory.low). However, this is rather cumbersome and
> error-prone. Furthermore, when there is a delay in correcting the memory
> limits (e.g. when hugetlb usage changes between consecutive runs of the
> userspace agent), the system could be in an over- or underprotected
> state.
> 
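
To make the polling workaround above concrete, here is a minimal Python
sketch of such a userspace agent. It is not the agent used in production.
It assumes a cgroup v2 hierarchy, 1GB gigantic pages (so the usage file is
hugetlb.1GB.current), and that the goal is to keep hugetlb plus all other
memory within one fixed budget. The function names and the cgroup
directory passed in are hypothetical.

```python
import os

GIB = 1024 ** 3


def adjusted_memory_max(total_budget: int, hugetlb_usage: int) -> int:
    """Return the memory.max value (bytes) that leaves room for hugetlb.

    Without hugetlb accounting in the memory controller, memory.max only
    covers anon, cache, slab, etc., so hugetlb usage is subtracted by hand.
    """
    if total_budget < 0 or hugetlb_usage < 0:
        raise ValueError("sizes must be non-negative")
    return max(total_budget - hugetlb_usage, 0)


def poll_once(cgroup_dir: str, total_budget: int) -> int:
    """Read current hugetlb usage and rewrite memory.max for one cgroup.

    cgroup_dir is an assumed path such as /sys/fs/cgroup/<name>.
    """
    with open(os.path.join(cgroup_dir, "hugetlb.1GB.current")) as f:
        hugetlb_usage = int(f.read().strip())
    new_max = adjusted_memory_max(total_budget, hugetlb_usage)
    with open(os.path.join(cgroup_dir, "memory.max"), "w") as f:
        f.write(str(new_max))
    return new_max
```

For a 32G container currently using 3G of hugetlb memory, this sets
memory.max to 29G. Any change in hugetlb usage between two calls to
poll_once() leaves the limit stale, which is the delay described above.
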
> This patch rectifies this issue by charging the memcg when the hugetlb
> folio is utilized, and uncharging when the folio is freed (analogous to
> the hugetlb controller). Note that we do not charge when the folio is
> allocated to the hugetlb pool, because at this point it is not owned by
> any memcg.
> 
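
The charging rule described above can be illustrated with a toy Python
model. This is only a sketch of the semantics stated in the patch
description, not kernel code: the class and method names are invented,
reclaim is not modelled, and a failed charge is represented by an
exception (the description says the real failure results in SIGBUS).

```python
GIB = 1024 ** 3


class ChargeError(Exception):
    """Stands in for a failed memcg charge in this toy model."""


class ToyMemcg:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.usage = 0

    def charge(self, size: int) -> None:
        # No reclaim is modelled: going over the limit fails immediately.
        if self.usage + size > self.limit:
            raise ChargeError("memcg limit hit")
        self.usage += size

    def uncharge(self, size: int) -> None:
        self.usage -= size


class ToyHugetlbPool:
    def __init__(self, page_size: int) -> None:
        self.page_size = page_size
        self.free_pages = 0

    def add_page(self) -> None:
        # Allocating into the pool charges nobody: no memcg owns the page yet.
        self.free_pages += 1

    def use_page(self, memcg: ToyMemcg) -> None:
        if self.free_pages == 0:
            raise RuntimeError("hugetlb pool is empty")
        # Charge first; this can fail even though the pool still has pages.
        memcg.charge(self.page_size)
        self.free_pages -= 1

    def free_page(self, memcg: ToyMemcg) -> None:
        # Freeing the page uncharges the memcg and returns it to the pool.
        memcg.uncharge(self.page_size)
        self.free_pages += 1
```

With a pool of two 1G pages and a memcg limited to 1G, the first
use_page() succeeds and the second raises ChargeError while one page is
still free in the pool.
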
> Some caveats to consider:
>   * This feature is only available on cgroup v2.
>   * There is no hugetlb pool management involved in the memory
>     controller. As stated above, hugetlb folios are only charged towards
>     the memory controller when they are used. Host overcommit management
>     has to take this into account when configuring hard limits.
>   * Failure to charge towards the memcg results in SIGBUS. This could
>     happen even if the hugetlb pool still has pages (but the cgroup
>     limit is hit and the reclaim attempt fails).
>   * When this feature is enabled, hugetlb pages contribute to memory
>     reclaim protection. Tuning of the low and min limits must take
>     hugetlb memory into account.
>   * Hugetlb pages utilized while this option is not selected will not
>     be tracked by the memory controller (even if cgroup v2 is remounted
>     later on).
> 
> Signed-off-by: Nhat Pham <nphamcs@xxxxxxxxx>

Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>