Re: [Lsf-pc] [LSF/MM/BPF TOPIC] time to reconsider tracepoints in the vfs?

Jan Kara <jack@xxxxxxx> · Thu, 16 Jan 2025 18:20:15 +0100

On Thu 16-01-25 07:49:49, Theodore Ts'o wrote:
> Historically, we have avoided adding tracepoints to the VFS because of
> concerns that tracepoints would be considered a userspace-level
> interface, and would therefore potentially constrain our ability to
> improve an interface which has been extremely performance critical.
> 
> I'd like to discuss whether in 2025, it's time to reconsider our
> reticence in adding tracepoints in the VFS layer.  First, while there
> has been a single incident of a tracepoint being used by programs that
> were distributed far and wide (powertop) such that we had to revert a
> change to a tracepoint that broke it --- that was **14** years ago,
> in 2011.  Across multiple other subsystems, many of
> which have added an extensive number of tracepoints, there has been
> only a single problem in over a decade, so I'd like to suggest that
> this concern may not have been as serious as we had first
> thought.
> 
> In practice, most tracepoints are used by system administrators and
> they have to deal with enough changes that break backwards
> compatibility (e.g., bash 3 -> bash 4, bash 4 -> bash 5, python 2.7 ->
> python 3, etc.) that the ones who really care end up using an
> enterprise distribution, which goes to extreme lengths to maintain the
> stable ABI nonsense.  Maintaining tracepoints shouldn't be a big deal
> for them.
> 
> Secondly, we've had a very long time to let the dentry interface
> mature, and so (a) the fundamental architecture of the dcache hasn't
> been changing as much in the past few years, and (b) we should have
> enough understanding of the interface to understand where we could put
> tracepoints (e.g., close to the syscall interface) which would make it
> much less likely that there would be any need to make
> backwards-incompatible changes to tracepoints.
> 
> The benefits of this would be to make it much easier for users,
> developers, and kernel developers to use BPF to probe file
> system-related activities.  Today, people who want to do these sorts
> of things need to use fs-specific tracepoints (for example, ext4 has a
> very large number of tracepoints which can be used for this purpose)
> but this locks users into a single file system and makes it harder for
> them to switch to a different file system, or to use different file
> systems for different use cases.
> 
> I'd like to propose that we experiment with adding tracepoints in
> early 2025, so that the year-end 2025 LTS kernels will have
> tracepoints that we are confident will be fit for purpose for BPF
> users.

So I personally have nothing against tracepoints in the VFS. Occasionally they
are useful, and so far userspace has pretty much accepted the fact that
they are a moving target. That being said, with BPF and all the tooling
around it (bcc, bpftrace) userspace has in my experience very much adapted
to just attaching BPF programs to random functions through kprobes so they
are not even relying that much on tracepoints anymore. Just look through
the bcc script collection... I have myself adapted to the lack of tracepoints
in the VFS by just using kprobes. The learning curve is a bit steeper, but after
that it's not a big deal.  I'm watching with a bit of concern developments
like BTF which try to provide some illusion of stability where there isn't
much of it. So some tool could spread widely enough, without getting regularly
broken, that breaking it would become a problem. But that is not really the
topic of this discussion.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR