On Sat, 20 Jun 2020 14:47:19 -0400 Waiman Long <longman@xxxxxxxxxx> wrote: > It was found that running the LTP test on a PowerPC system could produce > erroneous values in /proc/meminfo, like: > > MemTotal: 531915072 kB > MemFree: 507962176 kB > MemAvailable: 1100020596352 kB > > Using bisection, the problem is tracked down to commit 9c315e4d7d8c > ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()"). > > In memcg_uncharge_slab() with a "int order" argument: > > unsigned int nr_pages = 1 << order; > : > mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages); > > The mod_lruvec_state() function will eventually call the > __mod_zone_page_state() which accepts a long argument. Depending on > the compiler and how inlining is done, "-nr_pages" may be treated as > a negative number or a very large positive number. Apparently, it was > treated as a large positive number in that PowerPC system leading to > incorrect stat counts. This problem hasn't been seen in x86-64 yet, > perhaps the gcc compiler there has some slight difference in behavior. > > It is fixed by making nr_pages a signed value. For consistency, a > similar change is applied to memcg_charge_slab() as well. This is somewhat disturbing. > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -348,7 +348,7 @@ static __always_inline int memcg_charge_slab(struct page *page, > gfp_t gfp, int order, > struct kmem_cache *s) > { > - unsigned int nr_pages = 1 << order; > + int nr_pages = 1 << order; > struct mem_cgroup *memcg; > struct lruvec *lruvec; > int ret; > @@ -388,7 +388,7 @@ static __always_inline int memcg_charge_slab(struct page *page, > static __always_inline void memcg_uncharge_slab(struct page *page, int order, > struct kmem_cache *s) > { > - unsigned int nr_pages = 1 << order; > + int nr_pages = 1 << order; > struct mem_cgroup *memcg; > struct lruvec *lruvec; > I grabbed the patch, but Roman's "mm: memcg/slab: charge individual slab objects instead of pages" (http://lkml.kernel.org/r/20200608230654.828134-10-guro@xxxxxx) deletes both these functions. It replaces the offending code with, afaict, static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page, void *p) { struct obj_cgroup *objcg; unsigned int off; if (!memcg_kmem_enabled() || is_root_cache(s)) return; off = obj_to_index(s, page, p); objcg = page_obj_cgroups(page)[off]; page_obj_cgroups(page)[off] = NULL; obj_cgroup_uncharge(objcg, obj_full_size(s)); mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s), >>> -obj_full_size(s)); obj_cgroup_put(objcg); } -obj_full_size() returns size_t so I guess that's OK. Also static __always_inline void uncharge_slab_page(struct page *page, int order, struct kmem_cache *s) { #ifdef CONFIG_MEMCG_KMEM if (memcg_kmem_enabled() && !is_root_cache(s)) { memcg_free_page_obj_cgroups(page); percpu_ref_put_many(&s->memcg_params.refcnt, 1 << order); } #endif mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s), >>> -(PAGE_SIZE << order)); } PAGE_SIZE is unsigned long so I guess that's OK as well. Still, perhaps both could be improved. Negating an unsigned scalar is a pretty ugly thing to do. Am I wrong in thinking that all those mod_foo() functions need careful review?