On Thu, Sep 8, 2022 at 5:23 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> Hello.
>
> On Wed, Sep 07, 2022 at 04:35:37AM +0000, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> >  /* Subset of vm_event_item to report for memcg event stats */
> >  static const unsigned int memcg_vm_event_stat[] = {
> > +       PGPGIN,
> > +       PGPGOUT,
> >         PGSCAN_KSWAPD,
> >         PGSCAN_DIRECT,
> >         PGSTEAL_KSWAPD,
>
> What about adding a dummy entry at the beginning like:
>
>   static const unsigned int memcg_vm_event_stat[] = {
>   +       NR_VM_EVENT_ITEMS,
>   +       PGPGIN,
>   +       PGPGOUT,
>           PGSCAN_KSWAPD,
>           PGSCAN_DIRECT,
>
> > @@ -692,14 +694,30 @@ static const unsigned int memcg_vm_event_stat[] = {
> >  #endif
> >  };
> >
> > +#define NR_MEMCG_EVENTS ARRAY_SIZE(memcg_vm_event_stat)
> > +static int mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
> > +
> > +static void init_memcg_events(void)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < NR_MEMCG_EVENTS; ++i)
> > +               mem_cgroup_events_index[memcg_vm_event_stat[i]] = i + 1;
>
> Start such loops from i = 1, save i to the table.
>
> > +}
> > +
> > +static inline int memcg_events_index(enum vm_event_item idx)
> > +{
> > +       return mem_cgroup_events_index[idx] - 1;
> > +}
>
> And then there'd be no need for the reverse transform -1.
>
> I.e. it might be just a negligible micro-optimization, but since the
> event updates are on some fast (albeit longer) paths, it may be worth
> sacrificing one of the saved 8Bs in favor of no arithmetic.
>
> What do you think about this?
>
> >  static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
> >  {
> > -       return READ_ONCE(memcg->vmstats->events[event]);
> > +       int index = memcg_events_index(event);
> > +
> > +       if (index < 0)
> > +               return 0;
>
> As a bonus, these undefined maps could use the zero at the dummy location
> without a branch (slow paths though).
>
> > @@ -5477,7 +5511,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
> >                         parent->vmstats->state_pending[i] += delta;
> >         }
> >
> > -       for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
> > +       for (i = 0; i < NR_MEMCG_EVENTS; i++) {
>
> I applaud this part :-)

Hi Michal,

Thanks for taking a look. Let me get back to you on this later. I am at
the moment rearranging struct mem_cgroup for better packing and will be
running some benchmarks. Afterwards I will see whether your suggestion
brings a performance benefit or just more readable code, and then I will
follow up.

Shakeel