Re: [PATCH v3 2/2] Add XFS messages to printk index

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 30 Mar 2022 11:34:57 +1100

Hi Petr,

On Tue, Mar 29, 2022 at 03:34:00PM +0200, Petr Mladek wrote:
> On Fri 2022-03-25 10:19:46, Jonathan Lassoff wrote:
> > In order for end users to quickly react to new issues that come up in
> > production, it is proving useful to leverage the printk indexing system.
> > This printk index enables kernel developers to use calls to printk()
> > with changeable ad-hoc format strings, while still enabling end users
> > to detect changes from release to release.
> > 
> > So that detailed XFS messages are captures by this printk index, this
> > patch wraps the xfs_<level> and xfs_alert_tag functions.
> > 
> > Signed-off-by: Jonathan Lassoff <jof@xxxxxxxxxx>
> 
> > --- a/fs/xfs/xfs_message.h
> > +++ b/fs/xfs/xfs_message.h
> > @@ -6,34 +6,45 @@
> >  
> >  struct xfs_mount;
> >  
> > +#define xfs_printk_index_wrap(kern_level, mp, fmt, ...)		\
> > +({								\
> > +	printk_index_subsys_emit("%sXFS%s: ", kern_level, fmt);	\
> 
> I would probably use "%sXFS: " for the first parameter as
> a compromise here.

If you really want to be pedantic, the correct format string would
be "%sXFS (%s):" because of the reasons you state below -  mp &&
mp->m_super is almost -always- present, so almost all messages are
going to be emitted in the the (%s) form....

But that makes me wonder - if the format string doesn't apparently
need to be an exact match to what is actually output by the kernel,
then how is this information supposed to be used?

And so....

> It affects how the printk formats are shown in debugfs. With the
> current patch I see in /sys/kernel/debug/printk/index/vmlinux:
> 
> <4> fs/xfs/libxfs/xfs_ag.c:877 xfs_ag_shrink_space "%sXFS%s: Error %d reserving per-AG metadata reserve pool."
> <1> fs/xfs/libxfs/xfs_ag.c:151 xfs_initialize_perag_data "%sXFS%s: AGF corruption. Please run xfs_repair."
> <4> fs/xfs/libxfs/xfs_alloc.c:2429 xfs_agfl_reset "%sXFS%s: WARNING: Reset corrupted AGFL on AG %u. %d blocks leaked. Please unmount and run xfs_repair."
> <4> fs/xfs/libxfs/xfs_alloc.c:262 xfs_alloc_get_rec "%sXFS%s: start block 0x%x block count 0x%x"
> <4> fs/xfs/libxfs/xfs_alloc.c:260 xfs_alloc_get_rec "%sXFS%s: %s Freespace BTree record corruption in AG %d detected!"
> <1> fs/xfs/libxfs/xfs_attr_remote.c:304 xfs_attr_rmtval_copyout "%sXFS%s: remote attribute header mismatch bno/off/len/owner (0x%llx/0x%x/Ox%x/0x%llx)"
> <4> fs/xfs/libxfs/xfs_bmap.c:1129 xfs_iread_bmbt_block "%sXFS%s: corrupt dinode %llu, (btree extents)."

... onto the implications of explicitly exposing source code
directly to userspace via a kernel ABI.

I ask, because user/kernel ABIs are usually fixed and we are not
allowed to change them in a way that might break userspace. What
happens when one of these format messages gets moved? What if the
file, function and line of code all change, but the format string
stays the same? What about duplicate format strings in different
files/functions?

Exactly how is this supposed to be used by userspace? Given that you
are exposing both the file and the line of the file that the format
string belongs to, does this mean we can no longer actually move
this format string to any other location in the source code?

IOWs, I cannot find anything that documents the implications of
directly exposing the *raw source code* to userspace though a sysfs
file on either developers or userspace applications.  Exposing
anything through a sysfs file usually comes with constraints and
guarantees and just because it is in /sys/kernel/debug means we can
waive ABI constraints: I'll refer you to the canonical example of
tracepoints vs powertop.

With tracepoints in mind, XFS has an explicit policy that
tracepoints do not form part of the user ABI because they expose the
internal implementation directly to userspace. Hence if you use XFS
tracepoints for any purpose, you get to keep all the broken bits
when we change/add/remove tracepoints as part of our normal
development.

However, for information we explicitly expose via files in proc and
sysfs (and via ioctls, for that matter), we have to provide explicit
ABI guarantees, and that means we cannot remove or change the format
or content of those files in ways that would cause userspace parsers
and applications to break. If we are removing a proc/sysfs file, we
have an explicit deprecation process that takes years to run so that
userspace has time to notice that removal will be occurring and be
updated not to depend on it by the time we remove it.

I see no statement anywhere about what this printk index ABI
requires in terms of code stablility, format string maintenance and
modification, etc. I've seen it referred to as "semi-stable" but
there is no clear, easily accessible definition as to what that
means for either kernel developers or userspace app developers that
might want to use this information. There's zero information
available about how userspace will use this information, too, so at
this point I can't even guess what the policy for this new ABI
actually is.

If this was discussed and a policy was created, then great. But it
*hasn't been documented* for the rest of the world to be able to
read and understand so they know how to deal safely with the
information this ABI now provides. So, can you please explain what
the rules are, and then please write some documentation for the
kernel admin guide defining the user ABI for application writers and
what guarantees the kernel provides them with about the contents of
this ABI.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx