On 8/23/23, David Laight <David.Laight@xxxxxxxxxx> wrote:
> From: Jan Kara
>> Sent: Wednesday, August 23, 2023 10:49 AM
> ....
>> > --- a/include/linux/mm_types.h
>> > +++ b/include/linux/mm_types.h
>> > @@ -737,7 +737,11 @@ struct mm_struct {
>> >
>> > unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for
>> > /proc/PID/auxv */
>> >
>> > - struct percpu_counter rss_stat[NR_MM_COUNTERS];
>> > + union {
>> > + struct percpu_counter rss_stat[NR_MM_COUNTERS];
>> > + u64 *rss_stat_single;
>> > + };
>> > + bool magic_flag_stuffed_elsewhere;
>
> I wouldn't use a union to save a pointer - it is asking for trouble.
>

I may need to abandon this bit anyway -- counter init adds counters to
a global list and I can't easily call it like that.

>> >
>> > struct linux_binfmt *binfmt;
>> >
>> >
>> > Then for single-threaded case an area is allocated for NR_MM_COUNTERS
>> > counters * 2 -- first set updated without any synchro by current
>> > thread. Second set only to be modified by others and protected with
>> > mm->arg_lock. The lock protects remote access to the union to begin
>> > with.
>>
>> arg_lock seems a bit like a hack. How is it related to rss_stat? The
>> scheme
>> with two counters is clever but I'm not 100% convinced the complexity is
>> really worth it. I'm not sure the overhead of always using an atomic
>> counter would really be measurable as atomic counter ops in local CPU
>> cache
>> tend to be cheap. Did you try to measure the difference?
>
> A separate lock is worse than atomics.
> (Although some 32bit arch may have issues with 64bit atomics.)
>

But in my proposal the separate lock is used to facilitate *NOT* using
atomics by the most common consumer -- the only thread. The lock is
only used for the transition to the multithreaded state and for updates
by remote parties (both rare compared to updates by the current
thread).

> I think you'll be surprised just how slow atomic ops are.
> Even when present in the local cache.
> (Probably because any other copies have to be invalidated.)
>

Agreed. They have always been super expensive on x86-64 (and continue
to be). I keep running into claims that they are not, and I don't know
where that's coming from.

-- 
Mateusz Guzik <mjguzik gmail.com>