On Tue, 28 Jan 2025 14:05:05 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> Well, honestly, you were doing some odd things.

Some of those odd things were because of the use of the dentry as a
handle, which also required making an inode for every file. When the
number of event files blew up to tens of thousands, that caused a lot
of memory to be used.

> For a *simple* filesystem that actually acts as a filesystem, all you
> need is in libfs with things like &simple_dir_operations etc.
>
> And we have a *lot* of perfectly regular users of things like that.
> Not like the ftrace mess that had very *non*-filesystem semantics with
> separate lifetime confusion etc, and that tried to maintain a separate
> notion of permissions etc.

I would also say that the proc file system is rather messy. But that's
very old and has a long history, which probably built up its complexity.

> To make matters worse, tracefs than had a completely different model
> for events, and these interacted oddly in non-filesystem ways.

Ideally, I'd rather it had not been done that way. To save memory,
since every event in eventfs has the same files, it was better to just
make a single array that represents those files for every event. That
saved over 20 megabytes per tracing instance.

> In other words, all the tracefs problems were self-inflicted, and a
> lot of them were because you wanted to go behind the vfs layers back
> because you had millions of nodes but didn't want to have millions of
> inodes etc.
>
> That's not normal.
>
> I mean, you can pretty much literally look at ramfs:
>
>     fs/ramfs/inode.c
>
> and it is a real example filesystem that does a lot of things, but
> almost all of it is just using the direct vfs helpers (simple_lookup /
> simple_link/ simple_rmdir etc etc). It plays *zero* games with
> dentries.

It's also a storage file system. It just stores to memory, and it looks
like it simply uses the page cache and never needs to write anything
to disk.
It's not a good example for a control interface.

> Or look at fs/pstore.

Another storage device.

> Or any number of other examples.
>
> And no, nobody should *EVER* look at the horror that is tracefs and eventfs.

I believe kernfs is meant to cover control interfaces like sysfs and
debugfs, which actually change kernel behavior when their files are
written to. That's also likely why procfs is such a mess, because it
too is a control interface.

Yes, eventfs is "special", but tracefs could easily be converted to
kernfs. I believe Christian even wrote a POC that did that.

-- Steve