Hello. On Wed, Sep 07, 2022 at 04:35:37AM +0000, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote: > /* Subset of vm_event_item to report for memcg event stats */ > static const unsigned int memcg_vm_event_stat[] = { > + PGPGIN, > + PGPGOUT, > PGSCAN_KSWAPD, > PGSCAN_DIRECT, > PGSTEAL_KSWAPD, What about adding a dummy entry at the beginning like: static const unsigned int memcg_vm_event_stat[] = { + NR_VM_EVENT_ITEMS, + PGPGIN, + PGPGOUT, PGSCAN_KSWAPD, PGSCAN_DIRECT, > @@ -692,14 +694,30 @@ static const unsigned int memcg_vm_event_stat[] = { > #endif > }; > > +#define NR_MEMCG_EVENTS ARRAY_SIZE(memcg_vm_event_stat) > +static int mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly; > + > +static void init_memcg_events(void) > +{ > + int i; > + > + for (i = 0; i < NR_MEMCG_EVENTS; ++i) > + mem_cgroup_events_index[memcg_vm_event_stat[i]] = i + 1; Start such loops from i = 1, save i to the table. > +} > + > +static inline int memcg_events_index(enum vm_event_item idx) > +{ > + return mem_cgroup_events_index[idx] - 1; > +} And the there'd be no need for the reverse transforms -1. I.e. it might be just a negligible micro-optimization but since the event updates are on some fast (albeit longer) paths, it may be worth sacrificing one of the saved 8Bs in favor of no arithmetics. What do you think about this? > static unsigned long memcg_events(struct mem_cgroup *memcg, int event) > { > - return READ_ONCE(memcg->vmstats->events[event]); > + int index = memcg_events_index(event); > + > + if (index < 0) > + return 0; As a bonus these undefined maps could use the zero at the dummy location without branch (slow paths though). > @@ -5477,7 +5511,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu) > parent->vmstats->state_pending[i] += delta; > } > > - for (i = 0; i < NR_VM_EVENT_ITEMS; i++) { > + for (i = 0; i < NR_MEMCG_EVENTS; i++) { I applaud this part :-) Michal