On Mon, Jul 10, 2023 at 12:40 PM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 10, 2023 at 11:33:38AM -0700, Ivan Babrou wrote:
> > The following two commits added the same thing for tmpfs:
> >
> > * commit 2b4db79618ad ("tmpfs: generate random sb->s_uuid")
> > * commit 59cda49ecf6c ("shmem: allow reporting fanotify events with file handles on tmpfs")
> >
> > Having fsid allows using fanotify, which is especially handy for cgroups,
> > where one might be interested in knowing when they are created or removed.
> >
> > Signed-off-by: Ivan Babrou <ivan@xxxxxxxxxxxxxx>
> > ---
> >  fs/kernfs/mount.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> > index d49606accb07..930026842359 100644
> > --- a/fs/kernfs/mount.c
> > +++ b/fs/kernfs/mount.c
> > @@ -16,6 +16,8 @@
> >  #include <linux/namei.h>
> >  #include <linux/seq_file.h>
> >  #include <linux/exportfs.h>
> > +#include <linux/uuid.h>
> > +#include <linux/statfs.h>
> >
> >  #include "kernfs-internal.h"
> >
> > @@ -45,8 +47,15 @@ static int kernfs_sop_show_path(struct seq_file *sf, struct dentry *dentry)
> >  	return 0;
> >  }
> >
> > +int kernfs_statfs(struct dentry *dentry, struct kstatfs *buf)
> > +{
> > +	simple_statfs(dentry, buf);
> > +	buf->f_fsid = uuid_to_fsid(dentry->d_sb->s_uuid.b);
> > +	return 0;
> > +}
> > +
> >  const struct super_operations kernfs_sops = {
> > -	.statfs		= simple_statfs,
> > +	.statfs		= kernfs_statfs,
> >  	.drop_inode	= generic_delete_inode,
> >  	.evict_inode	= kernfs_evict_inode,
> >
> > @@ -351,6 +360,8 @@ int kernfs_get_tree(struct fs_context *fc)
> >  		}
> >  		sb->s_flags |= SB_ACTIVE;
> >
> > +		uuid_gen(&sb->s_uuid);
>
> Since kernfs has a lot of nodes (like hundreds of thousands if not more
> at times, being created at boot time), did you just slow down creating
> them all, and increase the memory usage in a measurable way?

This is just for the superblock, not every inode.
The memory increase is one UUID per kernfs instance (there are maybe 10
of them on a basic system), which is trivial. The same goes for CPU
usage.

> We were trying to slim things down, what userspace tools need this
> change? Who is going to use it, and what for?

The one concrete thing is ebpf_exporter:

* https://github.com/cloudflare/ebpf_exporter

I want to monitor cgroup changes so that I can keep an up-to-date map
of inode -> cgroup path, which lets me resolve the value returned from
bpf_get_current_cgroup_id() into something that a human can easily
grasp (think system.slice/nginx.service). Currently I do a full sweep
to build the map, which doesn't work if a cgroup is short-lived, as it
just disappears before I can resolve it. Unfortunately, systemd
recycles cgroups on restart, changing the inode number, so this is a
very real issue.

There's also this old wiki page from systemd:

* https://freedesktop.org/wiki/Software/systemd/Optimizations

Quoting from there:

> Get rid of systemd-cgroups-agent. Currently, whenever a systemd cgroup
> runs empty a tool "systemd-cgroups-agent" is invoked by the kernel
> which then notifies systemd about it. The need for this tool should
> really go away, which will save a number of forked processes at boot,
> and should make things faster (especially shutdown). This requires
> introduction of a new kernel interface to get notifications for
> cgroups running empty, for example via fanotify() on cgroupfs.

So it's a similar need to mine, just for systemd's own purposes.

Initially I tried adding this for cgroup fs only, but the problem felt
very generic, so I pivoted to having it in kernfs instead, so that any
kernfs-based filesystem would benefit.

Given the practically nonexistent overhead and the simplicity of this,
I think it's a change worth doing, unless there's a good reason not to
do it. I cc'd plenty of people to make sure it's not a bad decision.
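
For illustration, the "full sweep" approach described above can be
sketched in userspace C roughly as follows. This is not ebpf_exporter's
code: the cg_map_* names and the struct layout are hypothetical, and it
assumes, as the message does, that the id being resolved matches the
inode number of the cgroup directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical inode -> cgroup path map, filled by one directory sweep. */
struct cg_entry {
	ino_t ino;
	char *path;
};

struct cg_map {
	struct cg_entry *entries;
	size_t len, cap;
};

/* Append one directory; returns 0 on success, -1 on allocation failure. */
static int cg_map_add(struct cg_map *m, ino_t ino, const char *path)
{
	char *copy;

	if (m->len == m->cap) {
		size_t cap = m->cap ? m->cap * 2 : 64;
		struct cg_entry *e = realloc(m->entries, cap * sizeof(*e));

		if (!e)
			return -1;
		m->entries = e;
		m->cap = cap;
	}
	copy = strdup(path);
	if (!copy)
		return -1;
	m->entries[m->len].ino = ino;
	m->entries[m->len].path = copy;
	m->len++;
	return 0;
}

/* Record root and every directory below it. Returns -1 if root itself
 * cannot be recorded; failures on children are ignored. */
int cg_map_sweep(struct cg_map *m, const char *root)
{
	struct stat st;
	struct dirent *de;
	DIR *dir;

	if (lstat(root, &st) != 0 || !S_ISDIR(st.st_mode))
		return -1;
	if (cg_map_add(m, st.st_ino, root) != 0)
		return -1;
	dir = opendir(root);
	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		char child[PATH_MAX];
		int n;

		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		n = snprintf(child, sizeof(child), "%s/%s", root, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(child))
			continue;
		/* Regular files and cgroups that vanished mid-sweep are skipped. */
		(void)cg_map_sweep(m, child);
	}
	closedir(dir);
	return 0;
}

/* Linear lookup; returns NULL when the inode was not seen by the sweep. */
const char *cg_map_lookup(const struct cg_map *m, ino_t ino)
{
	for (size_t i = 0; i < m->len; i++)
		if (m->entries[i].ino == ino)
			return m->entries[i].path;
	return NULL;
}

void cg_map_free(struct cg_map *m)
{
	for (size_t i = 0; i < m->len; i++)
		free(m->entries[i].path);
	free(m->entries);
	m->entries = NULL;
	m->len = m->cap = 0;
}
```

A caller would run cg_map_sweep(&m, "/sys/fs/cgroup") and then
cg_map_lookup(&m, id). The weakness is the one described above: a
cgroup created and removed between two sweeps never shows up in the
map, which is why notifications are preferable to polling.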
> There were some benchmarks people were doing with booting large memory
> systems that you might want to reproduce here to verify that nothing is
> going to be harmed.

Skipping this given that overhead is per superblock and trivial.
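
Since the change is per superblock, one lightweight way to observe it
from userspace is to read f_fsid through statfs(2) on a kernfs mount.
The sketch below is only an illustration: read_fsid() and
fsid_is_zero() are hypothetical helper names, and the fsid is copied
out as two unsigned ints to avoid depending on the libc's private
field names.

```c
#include <string.h>
#include <sys/vfs.h>

/* Copy the f_fsid that statfs(2) reports for path into out[2].
 * Returns 0 on success, -1 if statfs() fails. */
int read_fsid(const char *path, unsigned int out[2])
{
	struct statfs st;

	_Static_assert(sizeof(st.f_fsid) == 2 * sizeof(unsigned int),
		       "f_fsid is expected to hold two ints");
	if (statfs(path, &st) != 0)
		return -1;
	memcpy(out, &st.f_fsid, sizeof(st.f_fsid));
	return 0;
}

/* An all-zero fsid means the filesystem did not fill in f_fsid. */
int fsid_is_zero(const unsigned int fsid[2])
{
	return fsid[0] == 0 && fsid[1] == 0;
}
```

Calling read_fsid("/sys/fs/cgroup", v) on a kernel without the patch
would be expected to yield an all-zero value, because simple_statfs()
does not set f_fsid; with the patch, each kernfs mount should report a
value derived from its own random s_uuid.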