On Wed, May 01, 2024 at 04:04:11PM +0200, Jesper Dangaard Brouer wrote:
> This closely resembles helpers added for the global cgroup_rstat_lock in
> commit fc29e04ae1ad ("cgroup/rstat: add cgroup_rstat_lock helpers and
> tracepoints"). This is for the per CPU lock cgroup_rstat_cpu_lock.
>
> Based on production workloads, we observe the fast-path "update" function
> cgroup_rstat_updated() is invoked around 3 million times per sec, while the
> "flush" function cgroup_rstat_flush_locked(), walking each possible CPU,
> can see periodic spikes of 700 invocations/sec.
>
> For this reason, the tracepoints are split into normal and fastpath
> versions for this per-CPU lock. Making it feasible for production to
> continuously monitor the non-fastpath tracepoint to detect lock contention
> issues. The reason for monitoring is that lock disables IRQs which can
> disturb e.g. softirq processing on the local CPUs involved. When the
> global cgroup_rstat_lock stops disabling IRQs (e.g converted to a mutex),
> this per CPU lock becomes the next bottleneck that can introduce latency
> variations.
>
> A practical bpftrace script for monitoring contention latency:
>
>  bpftrace -e '
>    tracepoint:cgroup:cgroup_rstat_cpu_lock_contended {
>      @start[tid]=nsecs; @cnt[probe]=count()}
>    tracepoint:cgroup:cgroup_rstat_cpu_locked {
>      if (args->contended) {
>        @wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}
>      @cnt[probe]=count()}
>    interval:s:1 {time("%H:%M:%S "); print(@wait_ns); print(@cnt); clear(@cnt);}'
>
> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>

Applied to cgroup/for-6.10.

Thanks.

--
tejun