On Mon, Mar 03, 2025 at 11:23:49AM -0800, JP Kobryn wrote:
>
>
> On 3/3/25 10:40 AM, Shakeel Butt wrote:
> > On Mon, Mar 03, 2025 at 06:29:53PM +0000, Yosry Ahmed wrote:
> > > On Mon, Mar 03, 2025 at 04:22:42PM +0100, Michal Koutný wrote:
> > > > On Thu, Feb 27, 2025 at 01:55:42PM -0800, inwardvessel <inwardvessel@xxxxxxxxx> wrote:
> > > > > From: JP Kobryn <inwardvessel@xxxxxxxxx>
> > > > ...
> > > > > +static inline bool is_base_css(struct cgroup_subsys_state *css)
> > > > > +{
> > > > > +	return css->ss == NULL;
> > > > > +}
> > > >
> > > > A similar predicate is also used in cgroup.c (various cgroup vs subsys
> > > > lifecycle functions, e.g. css_free_rwork_fn()). I think it'd be better
> > > > unified, i.e. open code the predicate here or use the helper in both
> > > > cases (css_is_cgroup() or similar).
> > > >
> > > > >  void __init cgroup_rstat_boot(void)
> > > > >  {
> > > > > -	int cpu;
> > > > > +	struct cgroup_subsys *ss;
> > > > > +	int cpu, ssid;
> > > > >
> > > > > -	for_each_possible_cpu(cpu)
> > > > > -		raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
> > > > > +	for_each_subsys(ss, ssid) {
> > > > > +		spin_lock_init(&cgroup_rstat_subsys_lock[ssid]);
> > > > > +	}
> > > >
> > > > Hm, with this loop I realize it may be worth putting this lock into
> > > > struct cgroup_subsys_state and initializing them in
> > > > cgroup_init_subsys() to keep all per-subsys data in one pack.
> > >
> > > I thought about this, but it would have unnecessary memory overhead, as
> > > we only need one lock per subsystem. Having a lock in every single
> > > css is wasteful.
> > >
> > > Maybe we can put the lock in struct cgroup_subsys? Then we can still
> > > initialize them in cgroup_init_subsys().
> > >
> >
> > Actually, one of the things I was thinking about is whether we can do
> > without a per-subsystem lock at all. At the moment it protects the
> > rstat_flush_next field (today in cgroup; in JP's series it is in css).
> > What if we make that field per-cpu? Then we don't need the
> > per-subsystem lock at all. Let me know if I missed something else that
> > is being protected by this lock.
> >
> > This would help the case where there are multiple flushers of the same
> > subsystem, possibly for different parts of the cgroup tree. They would
> > still compete on the per-cpu lock, but that would still be better than
> > a subsystem-level lock.
>
> Right, the trade-off would mean one subsystem flushing could contend for
> a cpu where a different subsystem is updating, and vice versa.

I meant we keep the per-subsystem per-cpu locks but remove the
per-subsystem lock (i.e. cgroup_rstat_subsys_lock) if we make
rstat_flush_next per-cpu. In that case, two memory stat readers will not
compete on the memory subsystem lock, but they will compete on the
per-cpu memory locks. We will have to experiment to see whether this
really helps, though.
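
Roughly, as an untested sketch (the struct and lock names below are
placeholders I'm making up, not necessarily what the series uses):

/*
 * Sketch only. Move rstat_flush_next into the per-cpu rstat state so
 * the flush-list linkage is only ever touched under the per-cpu lock
 * the flusher already takes, and the subsystem-wide
 * cgroup_rstat_subsys_lock has nothing left to protect.
 */
struct css_rstat_cpu {
	/* ... existing per-cpu updated-tree state ... */

	/* next css in this CPU's flush list */
	struct cgroup_subsys_state *rstat_flush_next;
};

/* assumed: one raw per-cpu lock per subsystem, as in this series */
static DEFINE_PER_CPU(raw_spinlock_t,
		      rstat_subsys_cpu_lock[CGROUP_SUBSYS_COUNT]);

static void css_rstat_flush_cpu(struct cgroup_subsys_state *css, int cpu)
{
	raw_spinlock_t *cpu_lock =
		per_cpu_ptr(&rstat_subsys_cpu_lock[css->ss->id], cpu);
	unsigned long flags;

	raw_spin_lock_irqsave(cpu_lock, flags);
	/*
	 * Pop this CPU's updated tree, link it through rstat_flush_next
	 * and flush it. Two flushers of the same subsystem now contend
	 * here, per-cpu, rather than on one subsystem-level lock.
	 */
	raw_spin_unlock_irqrestore(cpu_lock, flags);
}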
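
For comparison, if we do end up keeping a subsystem-level lock, Yosry's
suggestion of moving it into struct cgroup_subsys would amount to
something like (again just a sketch; the field name is made up):

struct cgroup_subsys {
	/* ... existing fields ... */
	spinlock_t rstat_lock;	/* one per subsystem */
};

with a matching spin_lock_init(&ss->rstat_lock) in cgroup_init_subsys(),
which keeps one lock per subsystem without bloating every css.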