On Tue, Aug 16, 2022 at 05:24:07PM +0800, Kefeng Wang wrote:
>
> On 2022/8/16 16:48, huang ying wrote:
> > On Tue, Aug 16, 2022 at 4:38 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
> > > From: Liu Shixin <liushixin2@xxxxxxxxxx>
> > >
> > > The page on pcplist could be used, but not counted into memory free or
> > > available, and pcp_free is only shown by show_mem(). Since commit
> > > d8a759b57035 ("mm, page_alloc: double zone's batchsize"), there is a
> > > significant decrease in the display of free memory, with a large number
> > > of cpus and nodes, the number of pages in the percpu list can be very
> > > large, so it is better to let the user know the pcp count.
> > Can you show some data?
>
> 80M+ with 72cpus/2node

80M+ for a 2 node system doesn't sound like a significant number.

> > Another choice is to count PCP free pages in MemFree.  Is that OK for
> > your use case too?
>
> Yes, the user will make policy according to MemFree, we think counting PCP
> free pages in MemFree is better, but don't know whether it is the right way.

Is there a real problem where the user makes a sub-optimal policy decision
because of the unaccounted 80M+ of free memory?

Counting PCP pages as free seems natural, since they are indeed free
pages. One concern is that there might be many more calls to
__mod_zone_freepage_state() if you do free page counting for PCP pages;
I'm not sure whether that would hurt performance. Also, you will need to
differentiate in __free_one_page() whether free page counting is still
needed, since some pages are freed through PCP (and thus already counted)
while others are not.

BTW, since commit b92ca18e8ca59 ("mm/page_alloc: disassociate the
pcp->high from pcp->batch"), pcp size is no longer associated with batch
size. Are you testing on an older kernel?

Thanks,
Aaron

> > > Signed-off-by: Liu Shixin <liushixin2@xxxxxxxxxx>
> > > Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> > > ---
> > >  drivers/base/node.c | 14 +++++++++++++-
> > >  fs/proc/meminfo.c   |  9 +++++++++
> > >  2 files changed, 22 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > > index eb0f43784c2b..846864e45db6 100644
> > > --- a/drivers/base/node.c
> > > +++ b/drivers/base/node.c
> > > @@ -375,6 +375,9 @@ static ssize_t node_read_meminfo(struct device *dev,
> > >  	struct sysinfo i;
> > >  	unsigned long sreclaimable, sunreclaimable;
> > >  	unsigned long swapcached = 0;
> > > +	unsigned long free_pcp = 0;
> > > +	struct zone *zone;
> > > +	int cpu;
> > >
> > >  	si_meminfo_node(&i, nid);
> > >  	sreclaimable = node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B);
> > > @@ -382,9 +385,17 @@ static ssize_t node_read_meminfo(struct device *dev,
> > >  #ifdef CONFIG_SWAP
> > >  	swapcached = node_page_state_pages(pgdat, NR_SWAPCACHE);
> > >  #endif
> > > +	for_each_populated_zone(zone) {
> > > +		if (zone_to_nid(zone) != nid)
> > > +			continue;
> > > +		for_each_online_cpu(cpu)
> > > +			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->count;
> > > +	}
> > > +
> > >  	len = sysfs_emit_at(buf, len,
> > >  			    "Node %d MemTotal: %8lu kB\n"
> > >  			    "Node %d MemFree: %8lu kB\n"
> > > +			    "Node %d PcpFree: %8lu kB\n"
> > >  			    "Node %d MemUsed: %8lu kB\n"
> > >  			    "Node %d SwapCached: %8lu kB\n"
> > >  			    "Node %d Active: %8lu kB\n"
> > > @@ -397,7 +408,8 @@ static ssize_t node_read_meminfo(struct device *dev,
> > >  			    "Node %d Mlocked: %8lu kB\n",
> > >  			    nid, K(i.totalram),
> > >  			    nid, K(i.freeram),
> > > -			    nid, K(i.totalram - i.freeram),
> > > +			    nid, K(free_pcp),
> > > +			    nid, K(i.totalram - i.freeram - free_pcp),
> > >  			    nid, K(swapcached),
> > >  			    nid, K(node_page_state(pgdat, NR_ACTIVE_ANON) +
> > >  				   node_page_state(pgdat, NR_ACTIVE_FILE)),
> > > diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> > > index 6e89f0e2fd20..672c784dfc8a 100644
> > > --- a/fs/proc/meminfo.c
> > > +++ b/fs/proc/meminfo.c
> > > @@ -38,6 +38,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> > >  	unsigned long pages[NR_LRU_LISTS];
> > >  	unsigned long sreclaimable, sunreclaim;
> > >  	int lru;
> > > +	unsigned long free_pcp = 0;
> > > +	struct zone *zone;
> > > +	int cpu;
> > >
> > >  	si_meminfo(&i);
> > >  	si_swapinfo(&i);
> > > @@ -55,8 +58,14 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> > >  	sreclaimable = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B);
> > >  	sunreclaim = global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B);
> > >
> > > +	for_each_populated_zone(zone) {
> > > +		for_each_online_cpu(cpu)
> > > +			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->count;
> > > +	}
> > > +
> > >  	show_val_kb(m, "MemTotal: ", i.totalram);
> > >  	show_val_kb(m, "MemFree: ", i.freeram);
> > > +	show_val_kb(m, "PcpFree: ", free_pcp);
> > >  	show_val_kb(m, "MemAvailable: ", available);
> > >  	show_val_kb(m, "Buffers: ", i.bufferram);
> > >  	show_val_kb(m, "Cached: ", cached);
> > > --
> > > 2.35.3
> > >
> > >
> > .
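
For anyone who wants to reason about the numbers outside the kernel, the
accumulation done by the two hunks above can be modelled in plain
userspace C. This is only a rough sketch: struct zone_model,
MODEL_NR_CPUS, MODEL_PAGE_SHIFT, pages_to_kb() and pcp_free_pages() are
all hypothetical names made up for illustration, not kernel APIs, and a
4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NR_CPUS    4
#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct zone_model {
	int nid;				/* node the zone belongs to */
	int populated;				/* nonzero if the zone has pages */
	unsigned long pcp_count[MODEL_NR_CPUS];	/* pages on each CPU's pcplist */
};

/* Convert a page count to kB, in the spirit of the K() macro in the patch. */
static unsigned long pages_to_kb(unsigned long pages)
{
	return pages << (MODEL_PAGE_SHIFT - 10);
}

/*
 * Sum the per-CPU list counts of every populated zone on node nid.
 * A negative nid sums over all nodes, like the fs/proc/meminfo.c hunk;
 * a non-negative nid filters by node, like the drivers/base/node.c hunk.
 */
static unsigned long pcp_free_pages(const struct zone_model *zones,
				    size_t nr_zones, int nid)
{
	unsigned long free_pcp = 0;
	size_t z;
	int cpu;

	assert(zones != NULL || nr_zones == 0);

	for (z = 0; z < nr_zones; z++) {
		if (!zones[z].populated)
			continue;
		if (nid >= 0 && zones[z].nid != nid)
			continue;
		for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			free_pcp += zones[z].pcp_count[cpu];
	}
	return free_pcp;
}
```

Calling pages_to_kb(pcp_free_pages(zones, nr_zones, nid)) gives the value
a PcpFree line would show for that node in this model. Under the same
4 KiB assumption, the 80M figure quoted above would be about 20480 pages
in total, or roughly 284 pages per CPU summed over all zones on a 72-CPU
machine.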