On Wed, Apr 03, 2024 at 08:44:05AM -1000, Tejun Heo wrote:
> Hello,
>
> On Wed, Apr 03, 2024 at 06:27:16PM +0200, Jan Kara wrote:
> > Yeah, BPF is great and I use it but to fill in some cases from practice,
> > there are sysadmins refusing to install bcc or run your BPF scripts on
> > their systems due to company regulations, their personal fear, or whatever.
> > So debugging with what you can achieve from a shell is still the thing
> > quite often.
>
> Yeah, I mean, this happens with anything new. Tracing itself took quite a
> while to be adopted widely. BPF, bcc, bpftrace are all still pretty new and
> it's likely that the adoption line will keep shifting for quite a while.
> Besides, even with all the new gizmos there definitely are cases where good
> ol' cat interface makes sense.
>
> So, if the static interface makes sense, we add it but we should keep in
> mind that the trade-offs for adding such static infrastructure, especially
> for the ones which aren't *widely* useful, are rather quickly shifting in
> the less favorable direction.

A lot of our static debug infrastructure isn't that useful because it
just sucks.

Every time I hit a sysfs or procfs file that's just a single integer,
and nothing else, when clearly there's internal structure and
description that needs to be there, I die a little inside. It's lazy and
amateurish.

I regularly debug things in bcachefs over IRC in about 5-10 minutes of
asking people to check various files and pastebin them - this is my
normal process, I pretty much never have to ssh and touch the actual
machines.

That's how it should be if you just make a point of making your internal
state easy to view and introspect, but when I'm debugging issues that
run into the wider block layer, or memory reclaim, we often hit a wall.

Writeback throttling was buggy for _months_, no visibility or
introspection or concerns for debugging, and that's a small chunk of
code. io_uring - had to disable it. I _still_ have people bringing
issues to me that are clearly memory reclaim related but I don't have
the tools.

It's not like any of this code exports much in the way of useful
tracepoints either, but tracepoints often just aren't what you want;
what you want is just to be able to see internal state (_without_ having
to use a debugger, because that's completely impractical outside highly
controlled environments) - and tracing is also never the first thing you
want to reach for when you have a user asking you "hey, this thing went
wonky, what's it doing?" - tracing automatically turns it into a
multi-step process of deciding what you want to look at, running the
workload more to collect data, and iterating.

Think more about "what would make code easier to debug" and less about
"how do I shove this round peg through the square tracing/BPF slot".

There's _way_ more we could be doing that would just make our lives
easier.