[PATCHES][RFC][CFT] debugfs cleanups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



	Debugfs has several unpleasant problems, all with the same root
cause - use of bare struct inode.  Most of the filesystems embed struct
inode into fs-specific objects used to store whatever extra state is
needed for that filesystem.  struct inode itself has one opaque pointer
(->i_private), and for simple cases that's enough; unfortunately,
debugfs case is not that simple.

	A debugfs file needs to carry at least the following:
* which driver-supplied callbacks should be used by read/write/etc.
* which driver-supplied object should they act upon, if driver has
several objects of the same kind (a very common situation)

	What's more, since driver may remove the underlying object, we
need some exclusion between the methods and removal; that is to say, for
anything opened we need to keep track of the number of threads currently
in driver-supplied callbacks and we need some way for removal to make
sure that no further callbacks will be attempted and to wait for the
one in progress to finish.

	Worse yet, if we have several similar files for access to
different fields of the same driver's object, we need either a separate
set of callbacks for each field, or some way for a callback to tell
which field it's being used for.

	Result is kludges galore.  The pointer to driver's object is
stored in inode->i_private.  The pointer to use-tracking state (struct
debugfs_fsdata) is stored in dentry->d_fsdata...  and we have run out
of opaque fields.

	The pointer to driver-supplied callback table has nowhere to go -
we can't store it in ->i_fop (that points to debugfs wrappers that would
handle use counts and call the real driver-supplied callbacks).  OK, that
can be kludged around - we allocate debugfs_fsdata on the first use,
and we can keep that pointer in it afterwards.	So we have it stashed
into ->d_fsdata until that point and move it over into debugfs_fsdata
afterwards.  Since pointers are aligned, we can use the lowest bit to
tell one from another.

	That has grown even nastier when an option to use a trimmed-down
variant of method table had been added (for most of those files we only
need read/write/lseek).  Now we have 3 states - ->d_fsdata pointing to
struct debugfs_fsdata, ->d_fsdata used to stash a pointer to full struct
file_operations and ->d_fsdata used to stash a pointer to trimmed-down
variant.  Not to worry - all those pointers are at least 32bit-aligned,
so we can use the lower couple of bits to tell one state from another
(00, 01 and 11 resp.)

	It's still not the end, though - some drivers have fuckloads of
files for each underlying object, and while debugfs has an unsanitary
degree of fondness for templates, apparently there are some limits.
That gets kludged over in two fairly disgusting ways.  For one thing,
we already store a reference to driver-supplied file_operations, so
we can always have several (dozens of) identical copies of that thing
and have callback ask debugfs to give it the pointer.  Voila - we can
tell whether we are trying to read driver_object->plonk or driver_object->puke
by checking which copy does that point to.  Another kludge is to look
at dentry name - compare it with a bunch of strings until we find the
one we want.

	All of that could be avoided if we augmented debugfs inodes with
a couple of pointers - no more stashing crap in ->d_fsdata, etc.
And it's really not hard to do.  The series below attempts to untangle
that mess; it can be found in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.debugfs

	It's very lightly tested; review and more testing would be
very welcome.

Shortlog:
Al Viro (20):
      debugfs: fix missing mutex_destroy() in short_fops case
      debugfs: separate cache for debugfs inodes
      debugfs: move ->automount into debugfs_inode_info
      debugfs: get rid of dynamically allocation proxy_ops
      debugfs: don't mess with bits in ->d_fsdata
      debugfs: allow to store an additional opaque pointer at file creation
      carl9170: stop embedding file_operations into their objects
      b43: stop embedding struct file_operations into their objects
      b43legacy: make use of debugfs_get_aux()
      netdevsim: don't embed file_operations into your structs
      mediatek: stop messing with ->d_iname
      [not even compile-tested] greybus/camera - stop messing with ->d_iname
      mtu3: don't mess wiht ->d_iname
      xhci: don't mess with ->d_iname
      qat: don't mess with ->d_name
      sof-client-ipc-flood-test: don't mess with ->d_name
      slub: don't mess with ->d_name
      arm_scmi: don't mess with ->d_parent->d_name
      octeontx2: don't mess with ->d_parent or ->d_parent->d_name
      saner replacement for debugfs_rename()

Diffstat:
 Documentation/filesystems/debugfs.rst              |  12 +-
 .../crypto/intel/qat/qat_common/adf_tl_debugfs.c   |  36 +---
 drivers/firmware/arm_scmi/raw_mode.c               |  12 +-
 drivers/net/bonding/bond_debugfs.c                 |   9 +-
 drivers/net/ethernet/amd/xgbe/xgbe-debugfs.c       |  19 +-
 .../ethernet/marvell/octeontx2/af/rvu_debugfs.c    |  76 +++-----
 drivers/net/ethernet/marvell/skge.c                |   5 +-
 drivers/net/ethernet/marvell/sky2.c                |   5 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   6 +-
 drivers/net/netdevsim/hwstats.c                    |  29 ++-
 drivers/net/wireless/ath/carl9170/debug.c          |  28 ++-
 drivers/net/wireless/broadcom/b43/debugfs.c        |  27 ++-
 drivers/net/wireless/broadcom/b43legacy/debugfs.c  |  26 ++-
 drivers/opp/debugfs.c                              |  10 +-
 drivers/phy/mediatek/phy-mtk-tphy.c                |  40 +---
 drivers/staging/greybus/camera.c                   |  17 +-
 drivers/usb/host/xhci-debugfs.c                    |  25 +--
 drivers/usb/mtu3/mtu3_debugfs.c                    |  40 +---
 fs/debugfs/file.c                                  | 167 ++++++++--------
 fs/debugfs/inode.c                                 | 213 ++++++++++-----------
 fs/debugfs/internal.h                              |  55 +++---
 include/linux/debugfs.h                            |  34 +++-
 mm/shrinker_debug.c                                |  16 +-
 mm/slub.c                                          |  13 +-
 net/hsr/hsr_debugfs.c                              |   9 +-
 net/mac80211/debugfs_netdev.c                      |  11 +-
 net/wireless/core.c                                |   5 +-
 sound/soc/sof/sof-client-ipc-flood-test.c          |  39 ++--
 28 files changed, 399 insertions(+), 585 deletions(-)

Patches' overview:

01/20) debugfs: fix missing mutex_destroy() in short_fops case
	-stable fodder, in one place the logics for "what does ->d_fsdata
contain for this one" got confused.

Meat of the series:

02/20) debugfs: separate cache for debugfs inodes
	Embed them into container (struct debugfs_inode_info, with nothing
else in it at the moment), set the cache up, etc.
	Just the infrastructure changes letting us augment debugfs inodes
here; adding stuff will come at the next step.

03/20) debugfs: move ->automount into debugfs_inode_info
	... and don't bother with debugfs_fsdata for those.  Life's
simpler that way...

04/20) debugfs: get rid of dynamically allocation proxy_ops
	All it takes is having full_proxy_open() collect the information
about available methods and store it in debugfs_fsdata.
	Wrappers are called only after full_proxy_open() has succeeded
calling debugfs_get_file(), so they are guaranteed to have ->d_fsdata
already pointing to debugfs_fsdata.
	As the result, they can check if method is absent and bugger off
early, without any atomic operations, etc. - same effect as what we'd
have from NULL method.  Which makes the entire proxy_fops contents
unconditional, making it completely pointless - we can just put those
methods (unconditionally) into debugfs_full_proxy_file_operations and
forget about dynamic allocation, replace_fops, etc.

05/20) debugfs: don't mess with bits in ->d_fsdata
	The reason we need that crap is the dual use ->d_fsdata has there -
it's both holding a debugfs_fsdata reference after the first
debugfs_file_get() (actually, after the call of proxy ->open())
*and* it serves as a place to stash a reference to real file_operations
from object creation to the first open.  Oh, and it's triple use,
actually - that stashed reference might be to debugfs_short_fops.
	Bugger that for a game of solidiers - just put the operations
reference into debugfs-private augmentation of inode.  And split
debugfs_full_file_operations into full and short cases, so that
debugfs_get_file() could tell one from another.
	Voila - ->d_fsdata holds NULL until the first (successful)
debugfs_get_file() and a reference to struct debugfs_fsdata afterwards.

06/20) debugfs: allow to store an additional opaque pointer at file creation
	Set by debugfs_create_file_aux(name, mode, parent, data, aux, fops).
Plain debugfs_create_file() has it set to NULL.
Accessed by debugfs_get_aux(file).
	Convenience macros for numeric opaque data - debugfs_create_file_aux_num
and debugfs_get_aux_num, resp.

Now we can trim the crap from drivers:

A bunch "let's make a bunch of identical copies of struct file_operations
and use debugfs_real_fops() to tell which field are we asked to operate upon":
07/20) carl9170: stop embedding file_operations into their objects
08/20) b43: stop embedding struct file_operations into their objects
09/20) b43legacy: make use of debugfs_get_aux()
10/20) netdevsim: don't embed file_operations into your structs

BTW, that crack about several dozens?  Literal truth - carl9170 has forty
three such identical copies.  Hidden by creative uses of preprocessor...

A bunch of "let's play with dentry name" case:
11/20) mediatek: stop messing with ->d_iname
12/20) [not even compile-tested] greybus/camera - stop messing with ->d_iname
	depends on BROKEN, so...
13/20) mtu3: don't mess wiht ->d_iname
14/20) xhci: don't mess with ->d_iname
15/20) qat: don't mess with ->d_name
16/20) sof-client-ipc-flood-test: don't mess with ->d_name
17/20) slub: don't mess with ->d_name

Even nastier - these look at the name of parent directory instead.
18/20) arm_scmi: don't mess with ->d_parent->d_name
19/20) octeontx2: don't mess with ->d_parent or ->d_parent->d_name

The last commit is pretty much independent from the rest of series:
20/20) saner replacement for debugfs_rename()
Existing primitive has several problems:
	1) calling conventions are clumsy - it returns a dentry reference
that is either identical to its second argument or is an ERR_PTR(-E...);
in both cases no refcount changes happen.  Inconvenient for users and
bug-prone; it would be better to have it return 0 on success and -E... on
failure.
	2) it allows cross-directory moves; however, no such caller have
ever materialized and considering the way debugfs is used, it's unlikely
to happen in the future.  What's more, any such caller would have fun
issues to deal with wrt interplay with recursive removal.  It also makes
the calling conventions clumsier...
	3) tautological rename fails; the callers have no race-free way
to deal with that.
	4) new name must have been formed by the caller; quite a few
callers have it done by sprintf/kasprintf/etc., ending up with considerable
boilerplate.
	Proposed replacement: int debugfs_change_name(dentry, fmt, ...).
All callers convert to that easily, and it's simpler internally.
	IMO debugfs_rename() should go; if we ever get a real-world use
case for cross-directory moves in debugfs, we can always look into
the right way to handle that.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux