Re: [PATCH v3 2/2] Add XFS messages to printk index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 31, 2022 at 04:09:23PM +0200, Petr Mladek wrote:
> On Thu 2022-03-31 08:02:19, Dave Chinner wrote:
> > On Wed, Mar 30, 2022 at 12:47:39PM -0400, Steven Rostedt wrote:
> > > On Wed, 30 Mar 2022 12:52:58 +0100
> > > Chris Down <chris@xxxxxxxxxxxxxx> wrote:
> > > 
> > > > The policy, as with all debugfs APIs by default, is that it's completely 
> > > > unstable and there are no API stability guarantees whatsoever. That's why 
> > > > there's no extensive documentation for users: because this is a feature for 
> > > > kernel developers.
> > > > 
> > > > 0: https://lwn.net/Articles/309298/
> > > 
> > > That article you reference states the opposite of what you said. And I got
> > > burnt by it before. Because Linus stated, if it is available for users, it
> > > is an ABI.
> > > 
> > > From the article above:
> > > 
> > > "Linus put it this way:
> > > 
> > >    The fact that something is documented (whether correctly or not) has
> > >    absolutely _zero_ impact on anything at all. What makes something an ABI is
> > >    that it's useful and available. The only way something isn't an ABI is by
> > >    _explicitly_ making sure that it's not available even by mistake in a
> > >    stable form for binary use. Example: kernel internal data structures and
> > >    function calls. We make sure that you simply _cannot_ make a binary that
> > >    works across kernel versions. That is the only way for an ABI to not form."
> > > 
> > > IOW, files in debugfs are available for users, and if something is written
> > > that depends on it and it is useful, it becomes ABI.
> > 
> > Yup, that's exactly what happened with powertop and the tracepoints
> > it used and why I pointed to it as is the canonical example of
> > information exposed from within debugfs unintentionally becoming
> > stable KABI....
> 
> To be sure that we are on the same page.
> 
> Please, fix me if I am wrong. I am not that familiar with tracepoints.
> It is a rather complex feature. Each tracepoint has a name, arguments,
> fields, prints a message. I guess that the KABI aspects are:
> 
>     + name of the tracepoint
>     + situation when they are triggered
>     + names, type, and meaning of the parameters/fields
>     + format and meaning or the printed messages

These -aren't- things that make up the tracepoint KABI - this is the
-data- that the tracepoint infrastructure generates. This data
contains a *lot* of information about the internal implementation of
a subsystem. e.g. there are over 550 individual tracepoints in XFS
that span every single subsystem from IO paths to allocation to log
space reseverations.

> In compare, a potential KABI aspects of a particular printk format
> (message) are:
> 
>     + situation when it is printed
>     + format and meaning of the printed message

Again, I see this as the data being generated by the printk index,
not the KABI defined for ensuring access to the data doesn't change.

> They clearly have something in common. I guess that this causes the
> fear that the printk index feature might make convert a particular
> printk message into KABI.
> 
> Note that the above summary is not talking about debugfs at all.
> Is it really debugfs what made tracepoints considered a KABI?
> Are tracepoints usable without debugfs?

No, but....

A large number of the tracepoints in XFS come from Irix kernel
debugging infrastructure from the early/mid 1990s. Irix had a built
in kernel debugger (idb) and kernel crash dump tools that could also
run on a live kernel (icrash). XFS had code in it to add commands to
both of these tools to iterate the tracing events that were built
into the code.

commit 8a2bc927ff399dff08d4242c8cec9cb33e31eac2
Author: Doug Doucette <doucette@xxxxxxxxxxxx>
Date:   Mon May 9 04:38:21 1994 +0000

    Add a bunch of tracing code for bmap btrees.

That used generic, built in kernel tracing infrastructure that the
Irix kernel and kernel debugger provided developers, and that was
back in 1994.

When XFS was ported to Linux, SGI also ported idb and icrash to
linux - idg became kdb, and icrash is waht we now know as "crash".
The XFS CVS tree carried kdb patches and all the interfacing code
to add the tracing output commands to kdb.

Then, eventually, tracepoints came along and we did a macro
conversion of the original XFS tracing code to the new tracepoint
infrastructure.

IOWs, the tracing events we export via tracepoints in XFS has a long
history of existence before the linux kernel tracepoint
infrastructure ever existed, hence the method of extracting the
tracing data from the kernel doesn't magically make the *tracing
data* part of the kernel ABI.

The tracepoint KABI covers the debugfs interface and file
formats/protocols - the thing that applications like trace-cmd,
perf, PCP, etc interface with to configure and extract tracepoint
data from the kernel. Those binaries need to work across multiple
kernel versions to be cause to control and extract tracepoint
-data-. The data itself can change from kernel to kernel without
those tools breaking, but the data format, the debugfs control
interfaces, etc must all remain unchaged (or at least backward
compatible).

> It is clear that the debugfs interface might be KABI on its own.
> There are many tools that use the interface to actually use
> the tracepoints. A change in the interface might break the tools.
> But it will be about the interface and not about the particular
> tracepoints.

Right. Here's the problem with powertop, from 2010:

https://lwn.net/Articles/442113/

It hard coded a dependency on a *specific set of data* that a
specific tracepoint exported, and so when that tracepoint changed
powertop then broke. The article discusses how this made the
tracepoint data part of the KABI, then read what I said about
that idea w.r.t. XFS:

https://lwn.net/Articles/442340/

Now compare that situation to the concerns I raised about the printk
index.

That is, I'm concerned that the -data- that is exported through this
new printk indexing KABI in debugfs will get retconned as KABI.
That's the same concerns I raised with tracepoints way back then and
history has shown that I was right. i.e. that tracepoint -data-
should never be considered part of the KABI. I don't want to have to
spend the next 10 years making the same arguments about the XFS
printk index *data* not being KABI as we had with tracepoint data.

> But I think that tracepoints are KABI even without the debugfs
> interface. We could create 10 different interfaces for tracepoints
> between the kernel and userspace. And all will break userspace
> if the functionality of a tracepoint is modified.

Yes, the -control interface- is covered by KABI. However, the -data-
that is extracted through that control interface is not covered by
KABI - we can and do change that at will and give no guarantees
about the consistency or stability of the data from kernel to
kernel.

The printk index has the same concerns - the debugfs interface has
to conform to KABI rules, otherwise we break applications. The
-data- that it exposes, OTOH, is tightly tied to internal
implementation details and so must not be tied to KABI. It must be
allowed to chagne at will and applications need to consider it to be
unstable from kernel to kernel and use it appropriately.

> I want to say. IMHO, it is is not debugfs what made KABI from
> tracepoints. I think that tracepoints can be considered KABI on
> its own. The tracepoints were created together with the debugfs
> interface. They would not make any sense without each other.
> 
> This is not the case for printk() messages. They were always there.

So was the internal XFS tracing infrastructure and trace events. :)

> The printk index is not an interface for using the messages.
> It is like /proc/config.gz. The printk index describes what
> pieces are available in the kernel.
>
> IMHO, printk messages might already be considered KABI. There
> are clearly monitors checking particular messages. The printk index
> does not make any difference. Yes, it might be used to create a KABI
> checker. But a KABI checker does not create the KABI. KABI
> checkers exist only because something has already been considered
> KABI before.

AFAIA, printk messages have never been part of the KABI. Anyone who
writes a log scraper knows that messages can and do change over time
and that's their problem to deal with. As kernel developers we give
-zero- regard to KABI when writing, modifying or removing log
messages. That is how it should be, and this directly indicates that
the printk index -data- is not KABI, either.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux