On Mon, Feb 03, 2020 at 10:25:06AM -0800, Roman Gushchin wrote:
> On Mon, Feb 03, 2020 at 12:58:18PM -0500, Johannes Weiner wrote:
> > On Mon, Jan 27, 2020 at 09:34:37AM -0800, Roman Gushchin wrote:
> > > Currently s8 type is used for per-cpu caching of per-node statistics.
> > > It works fine because the overfill threshold can't exceed 125.
> > >
> > > But if some counters are in bytes (and the next commit in the series
> > > will convert slab counters to bytes), it's not gonna work:
> > > value in bytes can easily exceed s8 without exceeding the threshold
> > > converted to bytes. So to avoid overfilling per-cpu caches and breaking
> > > vmstats correctness, let's use s32 instead.
> > >
> > > This doesn't affect per-zone statistics. There are no plans to use
> > > zone-level byte-sized counters, so no reasons to change anything.
> >
> > Wait, is this still necessary? AFAIU, the node counters will account
> > full slab pages, including free space, and only the memcg counters
> > that track actual objects will be in bytes.
> >
> > Can you please elaborate?
>
> It's weird to have a counter with the same name (e.g. NR_SLAB_RECLAIMABLE_B)
> being in different units depending on the accounting scope.
> So I do convert all slab counters: global, per-lruvec,
> and per-memcg to bytes.

Since the node counters track allocated slab pages and the memcg
counter tracks allocated objects, arguably they shouldn't use the same
name anyway.

> Alternatively I can fork them, e.g. introduce per-memcg or per-lruvec
> NR_SLAB_RECLAIMABLE_OBJ
> NR_SLAB_UNRECLAIMABLE_OBJ

Can we alias them and reuse their slots?

	/* Reuse the node slab page counters item for charged objects */
	MEMCG_SLAB_RECLAIMABLE = NR_SLAB_RECLAIMABLE,
	MEMCG_SLAB_UNRECLAIMABLE = NR_SLAB_UNRECLAIMABLE,

> and keep global counters untouched.
> If going this way, I'd prefer to make
> them per-memcg, because it will simplify things on charging paths:
> now we do get task->mem_cgroup->obj_cgroup in the pre_alloc_hook(),
> and then obj_cgroup->mem_cgroup in the post_alloc_hook() just to
> bump per-lruvec counters.

I don't quite follow. Don't you still have to update the global
counters?

> Btw, I wonder if we really need per-lruvec counters at all (at least
> being enabled by default). For the significant amount of users who
> have a single-node machine it doesn't bring anything except performance
> overhead.

Yeah, for single-node systems we should be able to redirect everything
to the memcg counters, without allocating and tracking lruvec copies.

> For those who have multiple nodes (and most likely many many
> memory cgroups) it provides way too many data except for debugging
> some weird mm issues.
> I guess in the absolute majority of cases having global per-node + per-memcg
> counters will be enough.

Hm? Reclaim uses the lruvec counters.
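
For reference, the s8 overflow described in the quoted patch description
can be reproduced with a small userspace sketch. This is not the kernel
code: the struct and function names are hypothetical, the 4096-byte page
size is an assumption, and only the 125 threshold bound comes from the
quoted text. It models one CPU's cached delta that is flushed to a shared
counter once it crosses a threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define PAGE_SIZE_BYTES 4096L	/* assumed page size */
#define PAGE_THRESHOLD  125L	/* upper bound quoted in the patch description */

/* Hypothetical model: one CPU's cached delta plus the shared counter. */
struct stat_s8  { int8_t  pcp; long global; };
struct stat_s32 { int32_t pcp; long global; };

/* Add delta; flush the cached value to the shared counter past the threshold. */
static void mod_s8(struct stat_s8 *s, long delta, long threshold)
{
	long x;

	assert(threshold > 0);
	x = delta + s->pcp;
	if (x > threshold || x < -threshold) {
		s->global += x;
		x = 0;
	}
	/* Values outside [-128, 127] do not fit; common compilers wrap them. */
	s->pcp = (int8_t)x;
}

/* Same logic with a 32-bit cache, wide enough for a byte-sized threshold. */
static void mod_s32(struct stat_s32 *s, long delta, long threshold)
{
	long x;

	assert(threshold > 0);
	x = delta + s->pcp;
	if (x > threshold || x < -threshold) {
		s->global += x;
		x = 0;
	}
	s->pcp = (int32_t)x;
}

/* The value a reader would see: shared counter plus the unflushed delta. */
static long read_s8(const struct stat_s8 *s)   { return s->global + s->pcp; }
static long read_s32(const struct stat_s32 *s) { return s->global + s->pcp; }
```

With a page-unit threshold of 125, any value that stays cached fits in s8.
With the threshold converted to bytes (125 * 4096 here), a 200-byte update
stays below the threshold, is stored into the s8 cache, and is truncated
there, so read_s8() no longer returns 200, while read_s32() does.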