From: Jan Kara
> Sent: Wednesday, August 23, 2023 10:49 AM
....
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -737,7 +737,11 @@ struct mm_struct {
> >
> >  		unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for
> > /proc/PID/auxv */
> >
> > -		struct percpu_counter rss_stat[NR_MM_COUNTERS];
> > +		union {
> > +			struct percpu_counter rss_stat[NR_MM_COUNTERS];
> > +			u64 *rss_stat_single;
> > +		};
> > +		bool magic_flag_stuffed_elsewhere;

I wouldn't use a union to save a pointer - it is asking for trouble.

> >
> >  		struct linux_binfmt *binfmt;
> >
> >
> > Then for single-threaded case an area is allocated for NR_MM_COUNTERS
> > countes * 2 -- first set updated without any synchro by current
> > thread. Second set only to be modified by others and protected with
> > mm->arg_lock. The lock protects remote access to the union to begin
> > with.
>
> arg_lock seems a bit like a hack. How is it related to rss_stat? The scheme
> with two counters is clever but I'm not 100% convinced the complexity is
> really worth it. I'm not sure the overhead of always using an atomic
> counter would really be measurable as atomic counter ops in local CPU cache
> tend to be cheap. Did you try to measure the difference?

A separate lock is worse than atomics.
(Although some 32bit arch may have issues with 64bit atomics.)

I think you'll be surprised just how slow atomic ops are.
Even when present in the local cache.
(Probably because any other copies have to be invalidated.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
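
For anyone following along, here is a rough userspace sketch of the "two sets of counters" scheme described in the quoted proposal. It is not the patch and not kernel code. The names (struct rss_single, rss_add_owner(), rss_add_remote(), rss_read(), NR_COUNTERS) are made up for illustration, and a C11 atomic_flag spinlock stands in for mm->arg_lock. The owner set is updated with plain stores by the one thread that owns the mm. The remote set is only touched under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for NR_MM_COUNTERS. */
#define NR_COUNTERS 4

/*
 * Two sets of counters, as in the quoted description:
 *   owner[]  - updated only by the single owning thread, no synchronisation
 *   remote[] - updated by other threads, always under 'lock'
 * 'lock' is a trivial spinlock standing in for mm->arg_lock.
 */
struct rss_single {
	int64_t owner[NR_COUNTERS];
	int64_t remote[NR_COUNTERS];
	atomic_flag lock;
};

void rss_single_init(struct rss_single *rs)
{
	for (int i = 0; i < NR_COUNTERS; i++) {
		rs->owner[i] = 0;
		rs->remote[i] = 0;
	}
	atomic_flag_clear(&rs->lock);
}

static void rss_lock(struct rss_single *rs)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&rs->lock, memory_order_acquire))
		;
}

static void rss_unlock(struct rss_single *rs)
{
	atomic_flag_clear_explicit(&rs->lock, memory_order_release);
}

/* Fast path: only the owning thread may call this. Plain add, no atomics. */
void rss_add_owner(struct rss_single *rs, int idx, int64_t delta)
{
	assert(idx >= 0 && idx < NR_COUNTERS);
	rs->owner[idx] += delta;
}

/* Slow path: any other thread, serialised by the lock. */
void rss_add_remote(struct rss_single *rs, int idx, int64_t delta)
{
	assert(idx >= 0 && idx < NR_COUNTERS);
	rss_lock(rs);
	rs->remote[idx] += delta;
	rss_unlock(rs);
}

/*
 * Sum of both sets. In this sketch it is only safe to call from the
 * owning thread, because owner[] is read with a plain load.
 */
int64_t rss_read(struct rss_single *rs, int idx)
{
	int64_t sum;

	assert(idx >= 0 && idx < NR_COUNTERS);
	rss_lock(rs);
	sum = rs->remote[idx];
	rss_unlock(rs);
	return sum + rs->owner[idx];
}
```

The alternative Jan asks about would collapse both arrays into one array of atomic counters and use an atomic add on every update, which removes the lock but puts an atomic op on the owner's fast path as well. Which of the two costs more is exactly what would need measuring.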