[LSF/MM TOPIC] Making pseudo file systems inodes/dentries more like normal file systems

The tracefs file system was modeled on the debugfs file system. The
rationale for separating tracefs from debugfs was to allow systems to
enable tracing but still keep debugfs disabled.

The debugfs API centers around the dentry, e.g.:


struct dentry *debugfs_create_file(const char *name, umode_t mode,
				   struct dentry *parent, void *data,
				   const struct file_operations *fops);

struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);

To create a file in debugfs, you call debugfs_create_file(), which returns
a dentry handle that can be used to delete the file later. If parent is
NULL, the file is added at the root of the debugfs file system
(/sys/kernel/debug); otherwise it is added under parent, which is normally
a directory created with debugfs_create_dir().

Behind the scenes, an inode structure is also created to represent that
dentry. This all happens regardless of whether debugfs is mounted or not!

Because every trace event in the system is represented by a directory and
several files in tracefs's events directory, tracefs created quite a lot
of dentries and inodes.

  # find /sys/kernel/tracing/ | wc -l
18352

And if you create an instance it will duplicate all the events in the
instance directory:

  # mkdir /sys/kernel/tracing/instances/foo
  # find /sys/kernel/tracing/ | wc -l
36617

And that goes for every new instance you make (18265 more entries each
time on this system)!

  # mkdir /sys/kernel/tracing/instances/bar
  # find /sys/kernel/tracing/ | wc -l
54882

Having inodes and dentries created for all these files and directories,
even when they are not used, wastes a lot of memory.

Two years ago at LSF/MM I presented changing how the events directory works
via a new "eventfs" file system. It would still be part of tracefs, but it
would dynamically create the inodes and dentries on the fly.

As I was new to how VFS works, and really didn't understand it as well as I
would have liked, I just got something working and finally submitted it.
But because of my inexperience, Linus had some strong objections to the
code. Part of this was because I was touching dentries when he said I
shouldn't be. But that is because the code was modeled on debugfs, where
the dentry is the central part of the code.

Then Linus said to me:

"And dammit, it shouldn't be necessary. When the tree is mounted, there
 should be no existing dentries."

(I'd share the link, but it was on the security list so there's no public
link for this conversation)

Linus's comment made me realize how debugfs was doing it wrong!

He was right. When a file system is mounted, it should not have any
dentries or inodes. That's because dentries and inodes are basically a
"cache" of the underlying file system. They should only be created when
they are referenced.

The debugfs and tracefs file systems (and possibly other pseudo file
systems) should not be using the dentry as a descriptor for the object.
They should just create a generic object that can save the fops, mode,
parent, and data, and have the dentries and inodes created when referenced,
just like any other file system would.
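
As a rough illustration of that idea, here is a small userspace model. It
is not kernel code and not the proposed interface; every name in it
(pseudo_entry, pseudo_inode, pseudo_create(), pseudo_lookup(),
pseudo_evict()) is hypothetical. Creating an entry only records the name,
mode, fops, parent, and data. The cached object standing in for the
inode/dentry pair is allocated on the first lookup and can be dropped again
without losing the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the inode/dentry pair the VFS would cache. */
struct pseudo_inode {
	unsigned int mode;
	void *data;
};

/* Hypothetical lightweight descriptor; this is the handle callers keep. */
struct pseudo_entry {
	const char *name;
	unsigned int mode;
	const void *fops;		/* would be const struct file_operations * */
	void *data;
	struct pseudo_entry *parent;
	struct pseudo_entry *children;	/* first child */
	struct pseudo_entry *next;	/* next sibling */
	struct pseudo_inode *cached;	/* NULL until first lookup */
};

static size_t nr_cached;	/* number of live pseudo_inode objects */

/* Record the metadata only; nothing is cached at creation time. */
struct pseudo_entry *pseudo_create(const char *name, unsigned int mode,
				   struct pseudo_entry *parent,
				   void *data, const void *fops)
{
	struct pseudo_entry *e;

	assert(name != NULL);
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->name = name;
	e->mode = mode;
	e->fops = fops;
	e->data = data;
	e->parent = parent;
	if (parent) {
		e->next = parent->children;
		parent->children = e;
	}
	return e;
}

/* Instantiate the cached object on first reference, then reuse it. */
struct pseudo_inode *pseudo_lookup(struct pseudo_entry *dir, const char *name)
{
	struct pseudo_entry *e;

	for (e = dir->children; e; e = e->next) {
		if (strcmp(e->name, name) != 0)
			continue;
		if (!e->cached) {
			e->cached = malloc(sizeof(*e->cached));
			if (!e->cached)
				return NULL;
			e->cached->mode = e->mode;
			e->cached->data = e->data;
			nr_cached++;
		}
		return e->cached;
	}
	return NULL;
}

/* Drop the cached object (think memory pressure); the entry survives. */
void pseudo_evict(struct pseudo_entry *e)
{
	if (e->cached) {
		free(e->cached);
		e->cached = NULL;
		nr_cached--;
	}
}

size_t pseudo_nr_cached(void)
{
	return nr_cached;
}
```

In this model, creating thousands of entries costs one small descriptor
each, and pseudo_nr_cached() stays at zero until something actually looks a
file up.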

Now that I have finished the eventfs file system, I would like to present a
proposal to make a more generic interface that the rest of tracefs and even
debugfs could use that wouldn't rely on dentry as the main handle.

-- Steve



