[PATCH DRAFT 0/4] : Port tracefs to kernfs

Back in 2022 we already had a session at LSFMM where we talked about
eventfs and we said that it should be based on kernfs and any missing
functionality be implemented in kernfs. Instead we've gotten a
hand-rolled version of similar functionality and 100+ mail exchanges
over the last weeks to fix bugs in there, tying up people's time.

All we've heard so far were either claims that it would be too difficult
to port tracefs to kernfs or that it somehow wouldn't work but we've
never heard why and it's never been demonstrated why.

So I went and started a draft for porting all of tracefs to kernfs in
the hopes that someone picks this up and finishes the work. I've gotten
the core of it done and it's pretty easy to do logical copy-pasta to
port this to eventfs as well.

I want to see tracefs and eventfs ported to kernfs and get rid of the
hand-rolled implementation. I don't see the value in any additional
talks about why eventfs is special until we've seen an implementation of
tracefs on kernfs.

I'm pretty certain that we have capable people that can and want to
finish the port (I frankly don't have time for this unless I drop all
reviews). I only started jotting down the basics yesterday evening
and came to the conclusion that:

* It'll get rid of the pointless dentry pinning that is currently done
  in various places. Instead only a kernfs root and a kernfs node need
  to be stashed. Dentries and inodes are added on-demand.

* It'll make _all of_ tracefs capable of on-demand dentry and inode
  creation.

* Quoting [1]:

  > The biggest savings in eventfs is the fact that it has no meta data for
  > files. All the directories in eventfs has a fixed number of files when they
  > are created. The creating of a directory passes in an array that has a list
  > of names and callbacks to call when the file needs to be accessed. Note,
  > this array is static for all events. That is, there's one array for all
  > event files, and one array for all event systems, they are not allocated per
  > directory.

  This is all possible with kernfs.

* All ownership information (mode, uid, gid) is stashed and kept in
  kernfs_node->iattrs. So the parent kernfs_node's ownership can be used
  to set the child's ownership information. This will allow us to get
  rid of any custom permission checking and ->getattr() and ->setattr()
  calls.

* Private tracefs data that was stashed in inode->i_private is stashed
  in kernfs_node->priv. That's always accessible in kernfs_ops->open()
  calls via kernfs_open_file->kn->priv but it could also be transferred
  to kernfs_open_file->priv. In any case, it makes it a lot easier to
  handle private data than tracefs does now.

* It'll make maintenance of tracefs easier in the long run because new
  functionality and improvements get added to kernfs including better
  integration with namespaces (I've had patchsets for kernfs a while ago
  to unlock additional namespaces.)

* There's no need for separate i_ops for "instances" and regular tracefs
  directories. Simply compare the stashed kernfs_node of the "instances"
  directory against the current kernfs_node passed to ->mkdir() or
  ->rmdir() to decide whether the directory creation or deletion is
  allowed.

* Frankly, another big reason to do it is simply maintenance. All of the
  maintenance burden needs to be shifted to the generic kernfs
  implementation which is maintained by people familiar with filesystem
  details. I'm willing to support it too.

  No shade, but currently I don't see how eventfs can be maintained
  without the involvement of others. Maintainability alone should be a
  sufficient reason to move all of this to kernfs and add any missing
  functionality.

* If we have a session about this at LSFMM, I want to see a POC of
  tracefs and eventfs built on top of kernfs. I'm tired of talking about
  a private implementation of functionality that already exists.
  Otherwise, this is just wasting everyone's time and eventfs as it is
  will not become common infrastructure.

* Yes, debugfs could or should be ported as well but it's almost
  irrelevant for debugfs. It's a debugging filesystem. If you enable it
  on a production workload then you have bigger problems to worry about
  than wasted memory. So I don't consider that urgent. But tracefs is
  causing us headaches right now and I'm wary of cementing a
  hand-rolled implementation.
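
To make the "instances" point above concrete, here is a minimal
userspace sketch of the pointer comparison. It is a model under stated
assumptions, not code from the series: struct mock_node stands in for
struct kernfs_node, and stash_instances_node(), mock_tracefs_mkdir()
and mock_tracefs_rmdir() are hypothetical names. The argument shapes
follow the kernfs syscall ops, where ->mkdir() gets the parent node
and ->rmdir() gets the node being removed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kernfs_node; only what the sketch needs. */
struct mock_node {
	const char *name;
	struct mock_node *parent;
};

/* Stashed once, when the "instances" directory is created. */
static struct mock_node *instances_node;

static void stash_instances_node(struct mock_node *kn)
{
	instances_node = kn;
}

/* Models ->mkdir(): creation is only allowed directly under "instances". */
static int mock_tracefs_mkdir(struct mock_node *parent, const char *name)
{
	assert(parent != NULL && name != NULL);

	if (parent != instances_node)
		return -EPERM;

	/* A real handler would create the new trace instance here. */
	return 0;
}

/* Models ->rmdir(): only direct children of "instances" may be removed. */
static int mock_tracefs_rmdir(struct mock_node *kn)
{
	assert(kn != NULL);

	if (kn->parent != instances_node)
		return -EPERM;

	/* A real handler would tear down the trace instance here. */
	return 0;
}
```

With one set of directory operations and one stashed pointer, the
"instances" directory needs no inode_operations of its own in this
model; every other directory simply fails the comparison.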

So really, please let's move this to kernfs, fix any things that aren't
supported in kernfs (I haven't seen any) and get rid of all the custom
functionality. Part of the work is moving tracefs to the new mount api
(which should've been done anyway).

The fs/tracefs/ part already compiles. The rest I haven't finished
converting. All the file_operations need to be moved to kernfs_ops which
shouldn't be too difficult.
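
As a rough illustration of what such a conversion looks like for a
basic open function, here is a userspace model of the private data
flow described earlier. Everything here is an assumption made for the
sketch: the mock_ structs only mirror the kn, priv and
kernfs_open_file->priv fields named in this mail, and
struct mock_trace_array, mock_tracing_open() and
mock_tracing_release() are hypothetical. The handlers take a single
open-file argument, as kernfs_ops->open() and ->release() do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirrors of the kernfs fields mentioned in the text. */
struct mock_kernfs_node {
	void *priv;			/* was inode->i_private */
};

struct mock_kernfs_open_file {
	struct mock_kernfs_node *kn;
	void *priv;			/* per-open private data */
};

/* Hypothetical per-instance tracing state with a reference count. */
struct mock_trace_array {
	int ref;
};

/* Models an ->open() handler: fetch the data stashed on the node. */
static int mock_tracing_open(struct mock_kernfs_open_file *of)
{
	struct mock_trace_array *tr;

	assert(of != NULL && of->kn != NULL);

	tr = of->kn->priv;
	if (!tr)
		return -ENODEV;

	tr->ref++;
	of->priv = tr;	/* optional: hand it to the open file */
	return 0;
}

/* Models a ->release() handler: drop the reference taken in open. */
static void mock_tracing_release(struct mock_kernfs_open_file *of)
{
	struct mock_trace_array *tr = of->priv;

	if (tr)
		tr->ref--;
	of->priv = NULL;
}
```

The point of the model is that the handler never touches an inode or
a dentry: the node's priv pointer is the only lookup it needs.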

To: Steven Rostedt <rostedt@xxxxxxxxxxx>
To: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
To: Amir Goldstein <amir73il@xxxxxxxxx>
To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>

Link: https://lore.kernel.org/r/20240129105726.2c2f77f0@xxxxxxxxxxxxxxxxxx [1]
---
Christian Brauner (4):
      [DRAFT]: tracefs: port to kernfs
      [DRAFT]: trace: stash kernfs_node instead of dentries
      [DRAFT]: hwlat: port struct file_operations thread_mode_fops to struct kernfs_ops
      [DRAFT]: trace: illustrate how to convert basic open functions

 fs/kernfs/mount.c                 |  10 +
 fs/tracefs/inode.c                | 649 +++++++++++++-------------------------
 include/linux/kernfs.h            |   3 +
 include/linux/tracefs.h           |  18 +-
 kernel/trace/trace.c              |  22 +-
 kernel/trace/trace.h              |   4 +-
 kernel/trace/trace_events_synth.c |   4 +-
 kernel/trace/trace_events_user.c  |   2 +-
 kernel/trace/trace_hwlat.c        |  45 +--
 9 files changed, 270 insertions(+), 487 deletions(-)
---
base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
change-id: 20240131-tracefs-kernfs-3f2def6eab11




