On Thu, Sep 8, 2022 at 5:23 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> Hello.
>
> On Wed, Sep 07, 2022 at 04:35:37AM +0000, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> >  /* Subset of vm_event_item to report for memcg event stats */
> >  static const unsigned int memcg_vm_event_stat[] = {
> > +       PGPGIN,
> > +       PGPGOUT,
> >         PGSCAN_KSWAPD,
> >         PGSCAN_DIRECT,
> >         PGSTEAL_KSWAPD,
>
> What about adding a dummy entry at the beginning like:
>
>   static const unsigned int memcg_vm_event_stat[] = {
>   +       NR_VM_EVENT_ITEMS,
>   +       PGPGIN,
>   +       PGPGOUT,
>           PGSCAN_KSWAPD,
>           PGSCAN_DIRECT,
>
> > @@ -692,14 +694,30 @@ static const unsigned int memcg_vm_event_stat[] = {
> >  #endif
> >  };
> >
> > +#define NR_MEMCG_EVENTS ARRAY_SIZE(memcg_vm_event_stat)
> > +static int mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
> > +
> > +static void init_memcg_events(void)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < NR_MEMCG_EVENTS; ++i)
> > +               mem_cgroup_events_index[memcg_vm_event_stat[i]] = i + 1;
>
> Start such loops from i = 1, save i to the table.
>
> > +}
> > +
> > +static inline int memcg_events_index(enum vm_event_item idx)
> > +{
> > +       return mem_cgroup_events_index[idx] - 1;
> > +}
>
> And then there'd be no need for the reverse transform -1.
>
> I.e. it might be just a negligible micro-optimization, but since the
> event updates are on some fast (albeit longer) paths, it may be worth
> sacrificing one of the saved 8Bs in favor of no arithmetic.
>
> What do you think about this?
>
> >  static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
> >  {
> > -       return READ_ONCE(memcg->vmstats->events[event]);
> > +       int index = memcg_events_index(event);
> > +
> > +       if (index < 0)
> > +               return 0;
>
> As a bonus, these undefined maps could use the zero at the dummy location
> without a branch (slow paths though).
>
> > @@ -5477,7 +5511,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
> >                         parent->vmstats->state_pending[i] += delta;
> >         }
> >
> > -       for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
> > +       for (i = 0; i < NR_MEMCG_EVENTS; i++) {
>
> I applaud this part :-)

Hi Michal,

Thanks for taking a look. Let me get back to you on this later. I am at
the moment rearranging struct mem_cgroup for better packing and will be
running some benchmarks. Afterwards I will see whether your suggestion
brings a performance benefit or just more readable code, and then I will
follow up.

Shakeel