On Thu 27-07-17 12:23:55, Johannes Weiner wrote:
> From 14e3d7647b3cf524dbb005faaea96b00b6909c12 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Date: Thu, 27 Jul 2017 11:59:38 -0400
> Subject: [PATCH] mm: fix global NR_SLAB_.*CLAIMABLE counter reads
>
> As Tetsuo points out:
>
>   Commit 385386cff4c6f047 ("mm: vmstat: move slab statistics from
>   zone to node counters") broke "Slab:" field of /proc/meminfo . It
>   shows nearly 0kB.
>
> In addition to /proc/meminfo, this problem also affects the slab
> counters OOM/allocation failure info dumps, can cause early -ENOMEM
> from overcommit protection, and miscalculate image size requirements
> during suspend-to-disk.
>
> This is because the patch in question switched the slab counters from
> the zone level to the node level, but forgot to update the global
> accessor functions to read the aggregate node data instead of the
> aggregate zone data.
>
> Use global_node_page_state() to access the global slab counters.
>
> Fixes: 385386cff4c6 ("mm: vmstat: move slab statistics from zone to node counters")
> Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Looks good to me
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

... but global_page_state still seems to be very confusing. Do we want
s@global_page_state@global_zone_page_state@? This would be in line with
per-node statistics. Remaining callers
$ git grep "global_page_state(NR_" | sed 's@.*(\(NR_[A-Z_]*\)).*@\1@' | sort | uniq -c
      2 NR_BOUNCE
      2 NR_FREE_CMA_PAGES
     11 NR_FREE_PAGES
      1 NR_KERNEL_STACK_KB
      1 NR_MLOCK
      2 NR_PAGETABLE

seem to all be using it correctly. So what do you think about the
follow up?
---
From 5bb27725cd74ad85c7ac86a5d275c36def0bdd8c Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Fri, 28 Jul 2017 11:02:51 +0200
Subject: [PATCH] mm: rename global_page_state to global_zone_page_state

global_page_state is error prone as a recent bug report pointed out [1].
It only returns proper values for zone based counters as the enum it
gets suggests. We already have global_node_page_state so let's rename
global_page_state to global_zone_page_state to be more explicit here.
All existing users seem to be correct
$ git grep "global_page_state(NR_" | sed 's@.*(\(NR_[A-Z_]*\)).*@\1@' | sort | uniq -c
      2 NR_BOUNCE
      2 NR_FREE_CMA_PAGES
     11 NR_FREE_PAGES
      1 NR_KERNEL_STACK_KB
      1 NR_MLOCK
      2 NR_PAGETABLE

This patch shouldn't introduce any functional change.

[1] http://lkml.kernel.org/r/201707260628.v6Q6SmaS030814@xxxxxxxxxxxxxxxxxxx
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 fs/proc/meminfo.c      | 10 +++++-----
 include/linux/swap.h   |  4 ++--
 include/linux/vmstat.h |  4 ++--
 mm/mmap.c              |  6 +++---
 mm/nommu.c             |  4 ++--
 mm/page-writeback.c    |  4 ++--
 mm/page_alloc.c        | 12 ++++++------
 mm/util.c              |  2 +-
 mm/vmstat.c            |  4 ++--
 9 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 509a61668d90..cdd979724c74 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -80,7 +80,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "Active(file): ", pages[LRU_ACTIVE_FILE]);
 	show_val_kb(m, "Inactive(file): ", pages[LRU_INACTIVE_FILE]);
 	show_val_kb(m, "Unevictable: ", pages[LRU_UNEVICTABLE]);
-	show_val_kb(m, "Mlocked: ", global_page_state(NR_MLOCK));
+	show_val_kb(m, "Mlocked: ", global_zone_page_state(NR_MLOCK));

 #ifdef CONFIG_HIGHMEM
 	show_val_kb(m, "HighTotal: ", i.totalhigh);
@@ -114,9 +114,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "SUnreclaim: ",
 		    global_node_page_state(NR_SLAB_UNRECLAIMABLE));
 	seq_printf(m, "KernelStack: %8lu kB\n",
-		   global_page_state(NR_KERNEL_STACK_KB));
+		   global_zone_page_state(NR_KERNEL_STACK_KB));
 	show_val_kb(m, "PageTables: ",
-		    global_page_state(NR_PAGETABLE));
+		    global_zone_page_state(NR_PAGETABLE));
 #ifdef CONFIG_QUICKLIST
 	show_val_kb(m, "Quicklists: ", quicklist_total_size());
 #endif
@@ -124,7 +124,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "NFS_Unstable: ",
 		    global_node_page_state(NR_UNSTABLE_NFS));
 	show_val_kb(m, "Bounce: ",
-		    global_page_state(NR_BOUNCE));
+		    global_zone_page_state(NR_BOUNCE));
 	show_val_kb(m, "WritebackTmp: ",
 		    global_node_page_state(NR_WRITEBACK_TEMP));
 	show_val_kb(m, "CommitLimit: ", vm_commit_limit());
@@ -151,7 +151,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 #ifdef CONFIG_CMA
 	show_val_kb(m, "CmaTotal: ", totalcma_pages);
 	show_val_kb(m, "CmaFree: ",
-		    global_page_state(NR_FREE_CMA_PAGES));
+		    global_zone_page_state(NR_FREE_CMA_PAGES));
 #endif

 	hugetlb_report_meminfo(m);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index d83d28e53e62..bf49b79218f4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -262,8 +262,8 @@ extern unsigned long totalreserve_pages;
 extern unsigned long nr_free_buffer_pages(void);
 extern unsigned long nr_free_pagecache_pages(void);

-/* Definition of global_page_state not available yet */
-#define nr_free_pages() global_page_state(NR_FREE_PAGES)
+/* Definition of global_zone_page_state not available yet */
+#define nr_free_pages() global_zone_page_state(NR_FREE_PAGES)


 /* linux/mm/swap.c */
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index b3d85f30d424..97e11ab573f0 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -123,7 +123,7 @@ static inline void node_page_state_add(long x, struct pglist_data *pgdat,
 	atomic_long_add(x, &vm_node_stat[item]);
 }

-static inline unsigned long global_page_state(enum zone_stat_item item)
+static inline unsigned long global_zone_page_state(enum zone_stat_item item)
 {
 	long x = atomic_long_read(&vm_zone_stat[item]);
 #ifdef CONFIG_SMP
@@ -199,7 +199,7 @@ extern unsigned long sum_zone_node_page_state(int node,
 extern unsigned long node_page_state(struct pglist_data *pgdat,
 						enum node_stat_item item);
 #else
-#define sum_zone_node_page_state(node, item) global_page_state(item)
+#define sum_zone_node_page_state(node, item) global_zone_page_state(item)
 #define node_page_state(node, item) global_node_page_state(item)
 #endif /* CONFIG_NUMA */

diff --git a/mm/mmap.c b/mm/mmap.c
index 24e9261bdcc0..4fd6ee5c2b46 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3510,7 +3510,7 @@ static int init_user_reserve(void)
 {
 	unsigned long free_kbytes;

-	free_kbytes = global_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);
+	free_kbytes = global_zone_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);

 	sysctl_user_reserve_kbytes = min(free_kbytes / 32, 1UL << 17);
 	return 0;
@@ -3531,7 +3531,7 @@ static int init_admin_reserve(void)
 {
 	unsigned long free_kbytes;

-	free_kbytes = global_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);
+	free_kbytes = global_zone_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);

 	sysctl_admin_reserve_kbytes = min(free_kbytes / 32, 1UL << 13);
 	return 0;
@@ -3575,7 +3575,7 @@ static int reserve_mem_notifier(struct notifier_block *nb,

 		break;
 	case MEM_OFFLINE:
-		free_kbytes = global_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);
+		free_kbytes = global_zone_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);

 		if (sysctl_user_reserve_kbytes > free_kbytes) {
 			init_user_reserve();
diff --git a/mm/nommu.c b/mm/nommu.c
index fc184f597d59..53d5175a5c14 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1962,7 +1962,7 @@ static int __meminit init_user_reserve(void)
 {
 	unsigned long free_kbytes;

-	free_kbytes = global_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);
+	free_kbytes = global_zone_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);

 	sysctl_user_reserve_kbytes = min(free_kbytes / 32, 1UL << 17);
 	return 0;
@@ -1983,7 +1983,7 @@ static int __meminit init_admin_reserve(void)
 {
 	unsigned long free_kbytes;

-	free_kbytes = global_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);
+	free_kbytes = global_zone_page_state(NR_FREE_PAGES) << (PAGE_SHIFT - 10);

 	sysctl_admin_reserve_kbytes = min(free_kbytes / 32, 1UL << 13);
 	return 0;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8989eada0ef7..c8c3e8e5ddbf 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -363,7 +363,7 @@ static unsigned long global_dirtyable_memory(void)
 {
 	unsigned long x;

-	x = global_page_state(NR_FREE_PAGES);
+	x = global_zone_page_state(NR_FREE_PAGES);
 	/*
 	 * Pages reserved for the kernel should not be considered
 	 * dirtyable, to prevent a situation where reclaim has to
@@ -1405,7 +1405,7 @@ void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time)
  * will look to see if it needs to start dirty throttling.
  *
  * If dirty_poll_interval is too low, big NUMA machines will call the expensive
- * global_page_state() too often. So scale it near-sqrt to the safety margin
+ * global_zone_page_state() too often. So scale it near-sqrt to the safety margin
  * (the number of pages we may dirty without exceeding the dirty limits).
  */
 static unsigned long dirty_poll_interval(unsigned long dirty,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f582303629e3..ab3d2c475690 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4443,7 +4443,7 @@ long si_mem_available(void)
 	 * Estimate the amount of memory available for userspace allocations,
 	 * without causing swapping.
 	 */
-	available = global_page_state(NR_FREE_PAGES) - totalreserve_pages;
+	available = global_zone_page_state(NR_FREE_PAGES) - totalreserve_pages;

 	/*
 	 * Not all the page cache can be freed, otherwise the system will
@@ -4472,7 +4472,7 @@ void si_meminfo(struct sysinfo *val)
 {
 	val->totalram = totalram_pages;
 	val->sharedram = global_node_page_state(NR_SHMEM);
-	val->freeram = global_page_state(NR_FREE_PAGES);
+	val->freeram = global_zone_page_state(NR_FREE_PAGES);
 	val->bufferram = nr_blockdev_pages();
 	val->totalhigh = totalhigh_pages;
 	val->freehigh = nr_free_highpages();
@@ -4607,11 +4607,11 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		global_node_page_state(NR_SLAB_UNRECLAIMABLE),
 		global_node_page_state(NR_FILE_MAPPED),
 		global_node_page_state(NR_SHMEM),
-		global_page_state(NR_PAGETABLE),
-		global_page_state(NR_BOUNCE),
-		global_page_state(NR_FREE_PAGES),
+		global_zone_page_state(NR_PAGETABLE),
+		global_zone_page_state(NR_BOUNCE),
+		global_zone_page_state(NR_FREE_PAGES),
 		free_pcp,
-		global_page_state(NR_FREE_CMA_PAGES));
+		global_zone_page_state(NR_FREE_CMA_PAGES));

 	for_each_online_pgdat(pgdat) {
 		if (show_mem_node_skip(filter, pgdat->node_id, nodemask))
diff --git a/mm/util.c b/mm/util.c
index 5a5cb6a41da2..84bc17dbfa8f 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -590,7 +590,7 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 		return 0;

 	if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
-		free = global_page_state(NR_FREE_PAGES);
+		free = global_zone_page_state(NR_FREE_PAGES);
 		free += global_node_page_state(NR_FILE_PAGES);

 		/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 9a4441bbeef2..4544d44e9594 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1500,7 +1500,7 @@ static void *vmstat_start(struct seq_file *m, loff_t *pos)
 	if (!v)
 		return ERR_PTR(-ENOMEM);
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
-		v[i] = global_page_state(i);
+		v[i] = global_zone_page_state(i);
 	v += NR_VM_ZONE_STAT_ITEMS;

 	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
@@ -1589,7 +1589,7 @@ int vmstat_refresh(struct ctl_table *table, int write,
 	 * which can equally be echo'ed to or cat'ted from (by root),
 	 * can be used to update the stats just before reading them.
 	 *
-	 * Oh, and since global_page_state() etc. are so careful to hide
+	 * Oh, and since global_zone_page_state() etc. are so careful to hide
 	 * transiently negative values, report an error here if any of
 	 * the stats is negative, so we know to go looking for imbalance.
 	 */
-- 
2.13.2

-- 
Michal Hocko
SUSE Labs
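
For anyone following along without the tree at hand, here is a minimal
user-space sketch of why the explicit name helps. It is only a model under
stated assumptions, not the kernel code: the enums are cut down to two
entries each, the arrays are plain longs rather than atomic_long_t, and
model_zone_add()/model_node_add() are hypothetical helpers invented for the
illustration. The negative clamp mirrors what the vmstat.c comment above
describes as hiding transiently negative values.

```c
#include <assert.h>

/* Cut-down stand-ins for the kernel enums (assumption: real lists are longer). */
enum zone_stat_item { NR_FREE_PAGES, NR_MLOCK, NR_VM_ZONE_STAT_ITEMS };
enum node_stat_item { NR_SLAB_RECLAIMABLE, NR_SLAB_UNRECLAIMABLE,
		      NR_VM_NODE_STAT_ITEMS };

/* Two separate global arrays, one per counter family. */
static long vm_zone_stat[NR_VM_ZONE_STAT_ITEMS];
static long vm_node_stat[NR_VM_NODE_STAT_ITEMS];

/* Hypothetical helpers that adjust a counter in this model. */
static void model_zone_add(long x, enum zone_stat_item item)
{
	assert(item < NR_VM_ZONE_STAT_ITEMS);
	vm_zone_stat[item] += x;
}

static void model_node_add(long x, enum node_stat_item item)
{
	assert(item < NR_VM_NODE_STAT_ITEMS);
	vm_node_stat[item] += x;
}

/* Reads only the zone array; the name now says so. */
static unsigned long global_zone_page_state(enum zone_stat_item item)
{
	long x;

	assert(item < NR_VM_ZONE_STAT_ITEMS);
	x = vm_zone_stat[item];
	if (x < 0)	/* hide transiently negative values */
		x = 0;
	return x;
}

/* Reads only the node array. */
static unsigned long global_node_page_state(enum node_stat_item item)
{
	long x;

	assert(item < NR_VM_NODE_STAT_ITEMS);
	x = vm_node_stat[item];
	if (x < 0)
		x = 0;
	return x;
}
```

Note that C itself does not reject passing an enum node_stat_item where an
enum zone_stat_item is expected, so a slab counter handed to the zone
accessor simply indexes the wrong array. With both accessors naming the
array they read, that kind of mismatch is at least visible at the call site.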