[LSF/MM/BPF TOPIC] Debugging kernel lock performance using BPF

Debugging kernel performance bugs is usually hard. In this talk, I
would like to share my recent experience of using BPF to debug a
performance regression caused by kernel lock contention. I will
introduce the BPF programs that I used to root-cause a regression in
kernel operation latency that occurred only in our production
environment. Traditional profiling and monitoring tools failed to
capture the information I needed to debug this issue, but BPF tracing
programs provided a sufficient set of facilities, which turned out to
be very helpful. These programs are made possible by modern BPF
features, including:

 - Lock contention tracepoints [1]
 - BPF timer [2]
 - CO-RE kernel type matching APIs [3]
 - BPF static linking [4]

Looking forward, as we see more heterogeneous platforms, profiling
kernel synchronization performance and understanding it better are
going to become more important. Based on my experience, I would like
to open a discussion on supporting better observability of kernel
performance, including getting full lock owner information [5] and
getting Build ID stack traces more reliably [6].

[1] https://lore.kernel.org/bpf/20220322185709.141236-1-namhyung@xxxxxxxxxx/
[2] https://lwn.net/Articles/862136/
[3] https://lwn.net/Articles/898839/
[4] https://lwn.net/Articles/848997/
[5] https://lore.kernel.org/lkml/20230207002403.63590-1-namhyung@xxxxxxxxxx/
[6] https://lore.kernel.org/bpf/Y9vX49CtDzyg3B%2F8@krava/T/


