Re: [PATCH v3 2/2] Add XFS messages to printk index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 30, 2022 at 11:34:57AM +1100, Dave Chinner wrote:
> Hi Petr,
> 
> On Tue, Mar 29, 2022 at 03:34:00PM +0200, Petr Mladek wrote:
> > On Fri 2022-03-25 10:19:46, Jonathan Lassoff wrote:
> > > In order for end users to quickly react to new issues that come up in
> > > production, it is proving useful to leverage the printk indexing system.
> > > This printk index enables kernel developers to use calls to printk()
> > > with changeable ad-hoc format strings, while still enabling end users
> > > to detect changes from release to release.
> > > 
> > > So that detailed XFS messages are captures by this printk index, this
> > > patch wraps the xfs_<level> and xfs_alert_tag functions.
> > > 
> > > Signed-off-by: Jonathan Lassoff <jof@xxxxxxxxxx>
> > 
> > > --- a/fs/xfs/xfs_message.h
> > > +++ b/fs/xfs/xfs_message.h
> > > @@ -6,34 +6,45 @@
> > >  
> > >  struct xfs_mount;
> > >  
> > > +#define xfs_printk_index_wrap(kern_level, mp, fmt, ...)		\
> > > +({								\
> > > +	printk_index_subsys_emit("%sXFS%s: ", kern_level, fmt);	\
> > 
> > I would probably use "%sXFS: " for the first parameter as
> > a compromise here.
> 
> If you really want to be pedantic, the correct format string would
> be "%sXFS (%s):" because of the reasons you state below -  mp &&
> mp->m_super is almost -always- present, so almost all messages are
> going to be emitted in the the (%s) form....
> 
> But that makes me wonder - if the format string doesn't apparently
> need to be an exact match to what is actually output by the kernel,
> then how is this information supposed to be used?
> 
> And so....
> 
> > It affects how the printk formats are shown in debugfs. With the
> > current patch I see in /sys/kernel/debug/printk/index/vmlinux:
> > 
> > <4> fs/xfs/libxfs/xfs_ag.c:877 xfs_ag_shrink_space "%sXFS%s: Error %d reserving per-AG metadata reserve pool."
> > <1> fs/xfs/libxfs/xfs_ag.c:151 xfs_initialize_perag_data "%sXFS%s: AGF corruption. Please run xfs_repair."
> > <4> fs/xfs/libxfs/xfs_alloc.c:2429 xfs_agfl_reset "%sXFS%s: WARNING: Reset corrupted AGFL on AG %u. %d blocks leaked. Please unmount and run xfs_repair."
> > <4> fs/xfs/libxfs/xfs_alloc.c:262 xfs_alloc_get_rec "%sXFS%s: start block 0x%x block count 0x%x"
> > <4> fs/xfs/libxfs/xfs_alloc.c:260 xfs_alloc_get_rec "%sXFS%s: %s Freespace BTree record corruption in AG %d detected!"
> > <1> fs/xfs/libxfs/xfs_attr_remote.c:304 xfs_attr_rmtval_copyout "%sXFS%s: remote attribute header mismatch bno/off/len/owner (0x%llx/0x%x/Ox%x/0x%llx)"
> > <4> fs/xfs/libxfs/xfs_bmap.c:1129 xfs_iread_bmbt_block "%sXFS%s: corrupt dinode %llu, (btree extents)."
> 
> ... onto the implications of explicitly exposing source code
> directly to userspace via a kernel ABI.
> 
> I ask, because user/kernel ABIs are usually fixed and we are not
> allowed to change them in a way that might break userspace. What
> happens when one of these format messages gets moved? What if the
> file, function and line of code all change, but the format string
> stays the same? What about duplicate format strings in different
> files/functions?
> 
> Exactly how is this supposed to be used by userspace? Given that you
> are exposing both the file and the line of the file that the format
> string belongs to, does this mean we can no longer actually move
> this format string to any other location in the source code?
> 
> IOWs, I cannot find anything that documents the implications of
> directly exposing the *raw source code* to userspace though a sysfs
> file on either developers or userspace applications.  Exposing
> anything through a sysfs file usually comes with constraints and
> guarantees and just because it is in /sys/kernel/debug means we can
> waive ABI constraints: I'll refer you to the canonical example of
> tracepoints vs powertop.
> 
> With tracepoints in mind, XFS has an explicit policy that
> tracepoints do not form part of the user ABI because they expose the
> internal implementation directly to userspace. Hence if you use XFS
> tracepoints for any purpose, you get to keep all the broken bits
> when we change/add/remove tracepoints as part of our normal
> development.
> 
> However, for information we explicitly expose via files in proc and
> sysfs (and via ioctls, for that matter), we have to provide explicit
> ABI guarantees, and that means we cannot remove or change the format
> or content of those files in ways that would cause userspace parsers
> and applications to break. If we are removing a proc/sysfs file, we
> have an explicit deprecation process that takes years to run so that
> userspace has time to notice that removal will be occurring and be
> updated not to depend on it by the time we remove it.
> 
> I see no statement anywhere about what this printk index ABI
> requires in terms of code stablility, format string maintenance and
> modification, etc. I've seen it referred to as "semi-stable" but
> there is no clear, easily accessible definition as to what that
> means for either kernel developers or userspace app developers that
> might want to use this information. There's zero information
> available about how userspace will use this information, too, so at
> this point I can't even guess what the policy for this new ABI
> actually is.
> 
> If this was discussed and a policy was created, then great. But it
> *hasn't been documented* for the rest of the world to be able to
> read and understand so they know how to deal safely with the
> information this ABI now provides. So, can you please explain what
> the rules are, and then please write some documentation for the
> kernel admin guide defining the user ABI for application writers and
> what guarantees the kernel provides them with about the contents of
> this ABI.

FWIW if you /did/ accept this for 5.19, I would suggest adding:

printk_index_subsys_emit("XFS log messages shall not be considered a stable kernel ABI as they can change at any time");

I base that largely on the evidence -- there's nothing saying that
catalogued strings are or are not a stable ABI.  That means it's up to
the subsystem or the maintainers or whoever to make a decision, and I
would decide that while some people somewhere might benefit from having
the message catalogue over not having it (e.g. i18n), someone would have
to show a /very/ strong case for letting XFS get powertop'd.

Granted I won't be the maintainer after Sunday and this isn't 5.18
material, so I suppose the ball is in your court. ;)

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux