Re: [PATCH v3] mm,memcg: provide per-cgroup counters for NUMA balancing operations

On Wed, 14 Aug 2024 17:42:27 +0000 kaiyang2@xxxxxxxxxx wrote:

> From: Kaiyang Zhao <kaiyang2@xxxxxxxxxx>
> 
> The ability to observe the demotion and promotion decisions made by the
> kernel on a per-cgroup basis is important for monitoring and tuning
> containerized workloads on either NUMA machines or machines
> equipped with tiered memory.
> 
> Different containers in the system may experience drastically different
> memory tiering actions that cannot be distinguished from the global
> counters alone.
> 
> For example, a container running a workload that has much hotter
> memory accesses will likely see more promotions and fewer demotions,
> potentially depriving a colocated container of top tier memory to such
> an extent that its performance degrades unacceptably.
> 
> As another example, some containers may exhibit longer periods between
> data reuse, causing many more numa_hint_faults than numa_pages_migrated.
> In this case, tuning hot_threshold_ms may be appropriate, but the signal
> can easily be lost if only global counters are available.
> 
> This patch set adds seven counters to memory.stat in a cgroup:
> numa_pages_migrated, numa_pte_updates, numa_hint_faults, pgdemote_kswapd,
> pgdemote_khugepaged, pgdemote_direct and pgpromote_success. pgdemote_*
> and pgpromote_success are also available in memory.numa_stat.
> 
> count_memcg_events_mm() is added to count multiple event occurrences at
> once, and get_mem_cgroup_from_folio() is added because we need to get a
> reference to the memcg of a folio before it's migrated to track
> numa_pages_migrated. The accounting of PGDEMOTE_* is moved to
> shrink_inactive_list() before being changed to per-cgroup.
>
> ...
>
> @@ -1383,6 +1412,13 @@ static const struct memory_stat memory_stats[] = {
>  	{ "workingset_restore_anon",	WORKINGSET_RESTORE_ANON		},
>  	{ "workingset_restore_file",	WORKINGSET_RESTORE_FILE		},
>  	{ "workingset_nodereclaim",	WORKINGSET_NODERECLAIM		},
> +
> +	{ "pgdemote_kswapd",		PGDEMOTE_KSWAPD		},
> +	{ "pgdemote_direct",		PGDEMOTE_DIRECT		},
> +	{ "pgdemote_khugepaged",	PGDEMOTE_KHUGEPAGED	},
> +#ifdef CONFIG_NUMA_BALANCING
> +	{ "pgpromote_success",		PGPROMOTE_SUCCESS	},
> +#endif
>  };

Please document these new memory.stat entries in
Documentation/admin-guide/cgroup-v2.rst.
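
For anyone wanting to look at the new entries from userspace, here is a
minimal sketch of a reader, assuming a kernel with this patch applied and
the usual "name value" line layout of memory.stat. The helper names
(memstat_lookup, print_numa_counters) are hypothetical and not part of the
patch; the counter names are the seven listed in the changelog. The path is
supplied by the caller, since the cgroup v2 mount point and group name vary
between systems.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counter names as listed in the patch changelog. */
static const char *const numa_counters[] = {
	"numa_pages_migrated", "numa_pte_updates", "numa_hint_faults",
	"pgdemote_kswapd", "pgdemote_khugepaged", "pgdemote_direct",
	"pgpromote_success",
};

/*
 * Find one "key value" line in memory.stat-style text.
 * Returns 0 and stores the value on success, -1 if the key is absent
 * or its value cannot be parsed.
 */
static int memstat_lookup(const char *text, const char *key,
			  unsigned long long *val)
{
	size_t klen = strlen(key);
	const char *line = text;

	assert(text && key && val);
	while (line && *line) {
		/* Match whole keys only: the key must be followed by a space. */
		if (!strncmp(line, key, klen) && line[klen] == ' ')
			return sscanf(line + klen, "%llu", val) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Print whichever of the counters are present in the given memory.stat
 * file. Returns 0 on success, -1 if the file cannot be opened.
 */
static int print_numa_counters(const char *path)
{
	static char buf[16384];	/* assumed large enough for memory.stat */
	unsigned long long val;
	size_t i, n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	buf[n] = '\0';
	fclose(f);

	for (i = 0; i < sizeof(numa_counters) / sizeof(numa_counters[0]); i++) {
		if (!memstat_lookup(buf, numa_counters[i], &val))
			printf("%s %llu\n", numa_counters[i], val);
		else
			printf("%s (not reported)\n", numa_counters[i]);
	}
	return 0;
}
```

A caller would pass something like /sys/fs/cgroup/<group>/memory.stat to
print_numa_counters(). Entries that the running kernel does not export
(for example pgpromote_success without CONFIG_NUMA_BALANCING, per the hunk
above) are simply reported as absent rather than treated as an error.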
