On Sat, Jun 20, 2020 at 02:00:18PM -0700, Andrew Morton wrote:
> On Sat, 20 Jun 2020 14:47:19 -0400 Waiman Long <longman@xxxxxxxxxx> wrote:
> 
> > It was found that running the LTP test on a PowerPC system could produce
> > erroneous values in /proc/meminfo, like:
> > 
> >   MemTotal:       531915072 kB
> >   MemFree:        507962176 kB
> >   MemAvailable:   1100020596352 kB
> > 
> > Using bisection, the problem is tracked down to commit 9c315e4d7d8c
> > ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
> > 
> > In memcg_uncharge_slab() with a "int order" argument:
> > 
> > 	unsigned int nr_pages = 1 << order;
> > 	  :
> > 	mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
> > 
> > The mod_lruvec_state() function will eventually call the
> > __mod_zone_page_state() which accepts a long argument.  Depending on
> > the compiler and how inlining is done, "-nr_pages" may be treated as
> > a negative number or a very large positive number. Apparently, it was
> > treated as a large positive number in that PowerPC system leading to
> > incorrect stat counts. This problem hasn't been seen in x86-64 yet,
> > perhaps the gcc compiler there has some slight difference in behavior.
> > 
> > It is fixed by making nr_pages a signed value. For consistency, a
> > similar change is applied to memcg_charge_slab() as well.
> 
> This is somewhat disturbing.
> 
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -348,7 +348,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
> >  						 gfp_t gfp, int order,
> >  						 struct kmem_cache *s)
> >  {
> > -	unsigned int nr_pages = 1 << order;
> > +	int nr_pages = 1 << order;
> >  	struct mem_cgroup *memcg;
> >  	struct lruvec *lruvec;
> >  	int ret;
> > @@ -388,7 +388,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
> >  static __always_inline void memcg_uncharge_slab(struct page *page, int order,
> >  						struct kmem_cache *s)
> >  {
> > -	unsigned int nr_pages = 1 << order;
> > +	int nr_pages = 1 << order;
> >  	struct mem_cgroup *memcg;
> >  	struct lruvec *lruvec;
> 
> I grabbed the patch, but Roman's "mm: memcg/slab: charge individual
> slab objects instead of pages"
> (http://lkml.kernel.org/r/20200608230654.828134-10-guro@xxxxxx) deletes
> both these functions.

It looks like Waiman's patch should be backported to stable. So if you
can queue it before my series, that would be nice.

> 
> It replaces the offending code with, afaict,
> 
> 
> static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> 					void *p)
> {
> 	struct obj_cgroup *objcg;
> 	unsigned int off;
> 
> 	if (!memcg_kmem_enabled() || is_root_cache(s))
> 		return;
> 
> 	off = obj_to_index(s, page, p);
> 	objcg = page_obj_cgroups(page)[off];
> 	page_obj_cgroups(page)[off] = NULL;
> 
> 	obj_cgroup_uncharge(objcg, obj_full_size(s));
> 	mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> >>>			-obj_full_size(s));
> 
> 	obj_cgroup_put(objcg);
> }
> 
> -obj_full_size() returns size_t so I guess that's OK.
> 
> 
> Also
> 
> 
> static __always_inline void uncharge_slab_page(struct page *page, int order,
> 					       struct kmem_cache *s)
> {
> #ifdef CONFIG_MEMCG_KMEM
> 	if (memcg_kmem_enabled() && !is_root_cache(s)) {
> 		memcg_free_page_obj_cgroups(page);
> 		percpu_ref_put_many(&s->memcg_params.refcnt, 1 << order);
> 	}
> #endif
> 	mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
> >>>			    -(PAGE_SIZE << order));
> }
> 
> PAGE_SIZE is unsigned long so I guess that's OK as well.
> 
> 
> Still, perhaps both could be improved.  Negating an unsigned scalar is
> a pretty ugly thing to do.
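
For reference, here's a minimal userspace sketch (just an assumed
standalone demo compiled for LP64, not the kernel path itself) of why
the unsigned int negation goes wrong once it reaches a function taking
a long, while the size_t / unsigned long negations happen to come out
right:

#include <stdio.h>
#include <stddef.h>

/* Stand-in for a stat updater like __mod_zone_page_state(): takes a long. */
static void mod_state(long delta)
{
	printf("%ld\n", delta);
}

int main(void)
{
	int order = 2;
	unsigned int nr_pages = 1U << order;		/* 32-bit unsigned */
	size_t nr_bytes = (size_t)4096 << order;	/* 64-bit unsigned */

	/*
	 * -nr_pages is evaluated in unsigned int, yielding 0xfffffffc.
	 * Widening that to a 64-bit long preserves the value, so the
	 * callee sees 4294967292 instead of -4.
	 */
	mod_state(-nr_pages);		/* prints 4294967292 */

	/*
	 * -nr_bytes is evaluated in size_t, which is already as wide as
	 * long.  Converting 0xffffffffffffc000 to long is
	 * implementation-defined, but gcc/clang wrap it to -16384, which
	 * is why the size_t and unsigned long cases above look OK.
	 */
	mod_state(-nr_bytes);		/* prints -16384 */

	/* With a signed nr_pages, the arithmetic means what was intended. */
	mod_state(-(1 << order));	/* prints -4 */

	return 0;
}

Nothing kernel-specific there; it's the usual promotion/widening trap,
just buried under a few layers of inlining in the real code.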
> 
> Am I wrong in thinking that all those mod_foo() functions need careful
> review?
> 

I'll take a look too. Thanks!