On Thu, Mar 31, 2022 at 04:09:23PM +0200, Petr Mladek wrote:
> On Thu 2022-03-31 08:02:19, Dave Chinner wrote:
> > On Wed, Mar 30, 2022 at 12:47:39PM -0400, Steven Rostedt wrote:
> > > On Wed, 30 Mar 2022 12:52:58 +0100
> > > Chris Down <chris@xxxxxxxxxxxxxx> wrote:
> > >
> > > > The policy, as with all debugfs APIs by default, is that it's completely
> > > > unstable and there are no API stability guarantees whatsoever. That's why
> > > > there's no extensive documentation for users: because this is a feature for
> > > > kernel developers.
> > > >
> > > > 0: https://lwn.net/Articles/309298/
> > >
> > > That article you reference states the opposite of what you said. And I got
> > > burnt by it before. Because Linus stated, if it is available for users, it
> > > is an ABI.
> > >
> > > From the article above:
> > >
> > > "Linus put it this way:
> > >
> > > The fact that something is documented (whether correctly or not) has
> > > absolutely _zero_ impact on anything at all. What makes something an ABI is
> > > that it's useful and available. The only way something isn't an ABI is by
> > > _explicitly_ making sure that it's not available even by mistake in a
> > > stable form for binary use. Example: kernel internal data structures and
> > > function calls. We make sure that you simply _cannot_ make a binary that
> > > works across kernel versions. That is the only way for an ABI to not form."
> > >
> > > IOW, files in debugfs are available for users, and if something is written
> > > that depends on it and it is useful, it becomes ABI.
> >
> > Yup, that's exactly what happened with powertop and the tracepoints
> > it used and why I pointed to it as is the canonical example of
> > information exposed from within debugfs unintentionally becoming
> > stable KABI....
>
> To be sure that we are on the same page.
>
> Please, fix me if I am wrong. I am not that familiar with tracepoints.
> It is a rather complex feature.
> Each tracepoint has a name, arguments,
> fields, prints a message. I guess that the KABI aspects are:
>
>    + name of the tracepoint
>    + situation when they are triggered
>    + names, type, and meaning of the parameters/fields
>    + format and meaning or the printed messages

These -aren't- things that make up the tracepoint KABI - this is the
-data- that the tracepoint infrastructure generates. This data
contains a *lot* of information about the internal implementation of
a subsystem. e.g. there are over 550 individual tracepoints in XFS
that span every single subsystem from IO paths to allocation to log
space reservations.

> In compare, a potential KABI aspects of a particular printk format
> (message) are:
>
>    + situation when it is printed
>    + format and meaning of the printed message

Again, I see this as the data being generated by the printk index,
not the KABI defined for ensuring access to the data doesn't change.

> They clearly have something in common. I guess that this causes the
> fear that the printk index feature might make convert a particular
> printk message into KABI.
>
> Note that the above summary is not talking about debugfs at all.
> Is it really debugfs what made tracepoints considered a KABI?
> Are tracepoints usable without debugfs?

No, but....

A large number of the tracepoints in XFS come from Irix kernel
debugging infrastructure from the early/mid 1990s. Irix had a built
in kernel debugger (idb) and kernel crash dump tools that could also
run on a live kernel (icrash). XFS had code in it to add commands to
both of these tools to iterate the tracing events that were built
into the code.

commit 8a2bc927ff399dff08d4242c8cec9cb33e31eac2
Author: Doug Doucette <doucette@xxxxxxxxxxxx>
Date:   Mon May 9 04:38:21 1994 +0000

    Add a bunch of tracing code for bmap btrees.

That used generic, built in kernel tracing infrastructure that the
Irix kernel and kernel debugger provided developers, and that was
back in 1994.
When XFS was ported to Linux, SGI also ported idb and icrash to
Linux - idb became kdb, and icrash is what we now know as "crash".
The XFS CVS tree carried kdb patches and all the interfacing code to
add the tracing output commands to kdb. Then, eventually,
tracepoints came along and we did a macro conversion of the original
XFS tracing code to the new tracepoint infrastructure.

IOWs, the tracing events we export via tracepoints in XFS have a long
history of existence before the Linux kernel tracepoint
infrastructure ever existed, hence the method of extracting the
tracing data from the kernel doesn't magically make the *tracing
data* part of the kernel ABI.

The tracepoint KABI covers the debugfs interface and file
formats/protocols - the thing that applications like trace-cmd,
perf, PCP, etc interface with to configure and extract tracepoint
data from the kernel. Those binaries need to work across multiple
kernel versions to be able to control and extract tracepoint -data-.
The data itself can change from kernel to kernel without those tools
breaking, but the data format, the debugfs control interfaces, etc
must all remain unchanged (or at least backward compatible).

> It is clear that the debugfs interface might be KABI on its own.
> There are many tools that use the interface to actually use
> the tracepoints. A change in the interface might break the tools.
> But it will be about the interface and not about the particular
> tracepoints.

Right.

Here's the problem with powertop, from 2010:

https://lwn.net/Articles/442113/

It hard coded a dependency on a *specific set of data* that a
specific tracepoint exported, and so when that tracepoint changed
powertop then broke. The article discusses how this made the
tracepoint data part of the KABI; then read what I said about that
idea w.r.t. XFS:

https://lwn.net/Articles/442340/

Now compare that situation to the concerns I raised about the printk
index.
That is, I'm concerned that the -data- that is exported
through this new printk indexing KABI in debugfs will get retconned
as KABI. Those are the same concerns I raised with tracepoints way
back then and history has shown that I was right. i.e. that
tracepoint -data- should never be considered part of the KABI. I
don't want to have to spend the next 10 years making the same
arguments about the XFS printk index *data* not being KABI as we had
with tracepoint data.

> But I think that tracepoints are KABI even without the debugfs
> interface. We could create 10 different interfaces for tracepoints
> between the kernel and userspace. And all will break userspace
> if the functionality of a tracepoint is modified.

Yes, the -control interface- is covered by KABI. However, the -data-
that is extracted through that control interface is not covered by
KABI - we can and do change that at will and give no guarantees
about the consistency or stability of the data from kernel to
kernel.

The printk index has the same concerns - the debugfs interface has
to conform to KABI rules, otherwise we break applications. The
-data- that it exposes, OTOH, is tightly tied to internal
implementation details and so must not be tied to KABI. It must be
allowed to change at will and applications need to consider it to be
unstable from kernel to kernel and use it appropriately.

> I want to say. IMHO, it is is not debugfs what made KABI from
> tracepoints. I think that tracepoints can be considered KABI on
> its own. The tracepoints were created together with the debugfs
> interface. They would not make any sense without each other.
>
> This is not the case for printk() messages. They were always there.

So were the internal XFS tracing infrastructure and trace events. :)

> The printk index is not an interface for using the messages.
> It is like /proc/config.gz. The printk index describes what
> pieces are available in the kernel.
>
> IMHO, printk messages might already be considered KABI.
> There are clearly monitors checking particular messages. The printk index
> does not make any difference. Yes, it might be used to create a KABI
> checker. But a KABI checker does not create the KABI. KABI
> checkers exist only because something has already been considered
> KABI before.

AFAIA, printk messages have never been part of the KABI. Anyone who
writes a log scraper knows that messages can and do change over time
and that's their problem to deal with. As kernel developers we give
-zero- regard to KABI when writing, modifying or removing log
messages. That is how it should be, and this directly indicates that
the printk index -data- is not KABI, either.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx