On Thu, Mar 28, 2024 at 09:46:39AM -1000, Tejun Heo wrote:
> Hello,
>
> On Thu, Mar 28, 2024 at 03:40:02PM -0400, Kent Overstreet wrote:
> > Collecting latency numbers at various key places is _enormously_ useful.
> > The hard part is deciding where it's useful to collect; that requires
> > intimate knowledge of the code. Once you're defining those collection
> > points statically, doing it with BPF is just another useless layer of
> > indirection.
>
> Given how much flexibility helps with debugging, claiming it useless is a
> stretch.

Well, what would it add?

> > The time stats stuff I wrote is _really_ cheap, and you really want this
> > stuff always on so that you've actually got the data you need when
> > you're bughunting.
>
> For some stats and some use cases, always being available is useful and
> building fixed infra for them makes sense. For other stats and other use
> cases, flexibility is pretty useful too (e.g. what if you want percentile
> distribution which is filtered by some criteria?). They aren't mutually
> exclusive and I'm not sure bdi wb instrumentation is on top of enough
> people's minds.
>
> As for overhead, BPF instrumentation can be _really_ cheap too. We often run
> these programs per packet.

The main things I want are just
 - elapsed time since last writeback IO completed, so we can see at a
   glance if it's stalled
 - time stats on writeback IO initiation to completion

The main value of this one will be tracking down tail latency issues and
finding out where in the stack they originate.
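
To make the two items above concrete, here is a rough userspace sketch of
the bookkeeping involved. It is not the kernel time stats code and not a
proposed patch: struct wb_lat_stats and the wb_lat_*() helpers are
hypothetical names made up for illustration, timestamps are passed in as
plain nanosecond counters, and locking and percentiles are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model, not the kernel's time stats code. */
struct wb_lat_stats {
	uint64_t count;        /* number of completed IOs */
	uint64_t total_ns;     /* sum of initiation-to-completion times */
	uint64_t min_ns;       /* shortest duration seen */
	uint64_t max_ns;       /* longest duration seen (tail latency hint) */
	uint64_t last_done_ns; /* timestamp of the most recent completion */
};

static void wb_lat_init(struct wb_lat_stats *s)
{
	s->count = 0;
	s->total_ns = 0;
	s->min_ns = UINT64_MAX;
	s->max_ns = 0;
	s->last_done_ns = 0;
}

/* Record one IO from initiation (start_ns) to completion (done_ns). */
static void wb_lat_record(struct wb_lat_stats *s,
			  uint64_t start_ns, uint64_t done_ns)
{
	uint64_t d;

	assert(done_ns >= start_ns);
	d = done_ns - start_ns;

	s->count++;
	s->total_ns += d;
	if (d < s->min_ns)
		s->min_ns = d;
	if (d > s->max_ns)
		s->max_ns = d;
	/* Completions may be recorded out of order; keep the latest. */
	if (done_ns > s->last_done_ns)
		s->last_done_ns = done_ns;
}

/* Mean duration in ns, or 0 if nothing has completed yet. */
static uint64_t wb_lat_mean_ns(const struct wb_lat_stats *s)
{
	return s->count ? s->total_ns / s->count : 0;
}

/*
 * Elapsed time since the last completion. Returns 0 if nothing has
 * completed yet or if now_ns is older than the last completion.
 */
static uint64_t wb_lat_since_last_ns(const struct wb_lat_stats *s,
				     uint64_t now_ns)
{
	if (!s->count || now_ns < s->last_done_ns)
		return 0;
	return now_ns - s->last_done_ns;
}
```

A large and growing wb_lat_since_last_ns() is the "stalled at a glance"
signal; max_ns far above the mean is where one would start looking for
tail latency.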