Re: [PATCH 0/5] samples/kernfs: Add a pseudo-filesystem to demonstrate kernfs usage

On Tue, 28 Jan 2025 12:51:47 -1000
Tejun Heo <tj@xxxxxxxxxx> wrote:

> Just for context, kernfs is factored out from sysfs. One of the factors
> which drove the design was memory overhead. On large systems (IIRC
> especially with iSCSI), there can be a huge number of sysfs nodes and
> allocating a dentry and inode pair for each file made some machines run out
> of memory during boot, so sysfs implemented memory-backed filesystem store
> which then made its interface to its users to depart from the VFS layer.
> This requirement holds for cgroup too - there are systems with a *lot* of
> cgroups and the associated interface files and we don't want to pin a dentry
> and inode for all of them.
> 

Right. And going back to ramfs, it too has a dentry and inode for every
file that is created. Thus, if you have a lot of files, you'll have a lot
of memory dedicated to their dentries and inodes that will never be freed.
ramfs_create() and ramfs_mkdir() both call ramfs_mknod(), which does a
d_instantiate() and a dget() on the dentry, so they persist until
they are deleted or a reboot happens.

What I did for eventfs, and what I believe kernfs does, is to create a
small descriptor to represent the control data and reference it like what
you would have on disk. That is, the control elements (like a trace event
descriptor) are really what is on "disk". When someone does an "ls" on the
pseudo file system, there needs to be a way for the VFS layer to query the
control structures, like how a normal file system would query the data
stored on disk, and then let the VFS layer create the dentries and inodes
when referenced and, more importantly, free them when they are no longer
referenced and there's memory pressure.
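
To make the idea concrete, here is a rough userspace sketch of the
pattern, not actual eventfs or kernfs code. Every name in it (struct desc,
struct node, fs_lookup(), fs_put(), fs_shrink()) is made up for
illustration. The descriptors stand in for the persistent control data,
and the nodes stand in for the dentry/inode pairs that only exist while
something references them or until memory pressure reclaims them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node;

/* Hypothetical persistent descriptor: the control data that is "on disk". */
struct desc {
	const char *name;
	struct node *cached;	/* NULL until the file is first looked up */
};

/* Hypothetical stand-in for a dentry/inode pair, created on demand. */
struct node {
	struct desc *owner;
	int refcount;
};

static int live_nodes;		/* number of nodes currently allocated */

/* Find the descriptor by name and instantiate its node only if needed. */
static struct node *fs_lookup(struct desc *table, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		struct desc *d = &table[i];

		if (strcmp(d->name, name) != 0)
			continue;
		if (!d->cached) {
			d->cached = calloc(1, sizeof(*d->cached));
			if (!d->cached)
				return NULL;
			d->cached->owner = d;
			live_nodes++;
		}
		d->cached->refcount++;
		return d->cached;
	}
	return NULL;		/* no such descriptor */
}

/* Drop a reference; the node stays cached until fs_shrink() runs. */
static void fs_put(struct node *nd)
{
	assert(nd && nd->refcount > 0);
	nd->refcount--;
}

/*
 * Simulated memory pressure: free every unreferenced node and return how
 * many were freed. The descriptors themselves are never touched.
 */
static int fs_shrink(struct desc *table, size_t n)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++) {
		struct node *nd = table[i].cached;

		if (nd && nd->refcount == 0) {
			free(nd);
			table[i].cached = NULL;
			live_nodes--;
			freed++;
		}
	}
	return freed;
}
```

With a table of, say, three descriptors, nothing is allocated until the
first fs_lookup(). A node that is still referenced survives fs_shrink(),
and once it is released and reclaimed, a later lookup simply rebuilds it
from the descriptor. The memory that stays around for the whole boot is
just the descriptor table.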

I believe kernfs does the same thing. And my point is, it would be nice to
have an abstract layer that represents control descriptors that may be
around for the entirety of the boot (like trace events are) without needing
to pin a dentry and inode for each one of these files. Currently, that
abstract layer is kernfs.

-- Steve



