On 16/04/2024 23.36, Tejun Heo wrote:
> On Tue, Apr 16, 2024 at 07:51:26PM +0200, Jesper Dangaard Brouer wrote:
>> This commit enhances the ability to troubleshoot the global
>> cgroup_rstat_lock by introducing wrapper helper functions for the lock
>> along with associated tracepoints.
> Applied to cgroup/for-6.10.
Thanks for applying the tracepoint patch. I've backported it to our
main production kernels, v6.6 LTS (along with the aforementioned
upstream cgroup work from Yosry and Longman). As of this morning, I have
it running in production on two machines. I'm inspecting it manually
with bpftrace scripts for now, but the plan is to monitor this
continuously (ebpf_exporter[1]) and even to alert on excessive wait
time under contention.
It makes sense to delay applying the next two patches until we have
run some production experiments with them and I have fleet monitoring
in place. I'll be offline next week (on a dive trip), so I'll resume
work on this on 29 April, and only then start the prod experiments.
--Jesper
[1] https://github.com/cloudflare/ebpf_exporter