On Mon 23-10-23 17:30:49, Amir Goldstein wrote: > There are many "simple" filesystems (*) that report null f_fsid in > statfs(2). Those "simple" filesystems report sb->s_dev as the st_dev > field of the stat syscalls for all inodes of the filesystem (**). > > In order to enable fanotify reporting of events with fsid on those > "simple" filesystems, report the sb->s_dev number in f_fsid field of > statfs(2). > > (*) For most of the "simple" filesystem refered to in this commit, the > ->statfs() operation is simple_statfs(). Some of those fs assign the > simple_statfs() method directly in their ->s_op struct and some assign it > indirectly via a call to simple_fill_super() or to pseudo_fs_fill_super() > with either custom or "simple" s_op. > We also make the same change to efivarfs and hugetlbfs, although they do > not use simple_statfs(), because they use the simple_* inode opreations > (e.g. simple_lookup()). > > (**) For most of the "simple" filesystems, the ->getattr() method is not > assigned, so stat() is implemented by generic_fillattr(). A few "simple" > filesystem use the simple_getattr() method which also calls > generic_fillattr() to fill most of the stat struct. > > The two exceptions are procfs and 9p. procfs implements several different > ->getattr() methods, but they all end up calling generic_fillattr() to > fill the st_dev field from sb->s_dev. > > 9p has more complicated ->getattr() methods, but they too, end up calling > generic_fillattr() to fill the st_dev field from sb->s_dev. > > Note that 9p and kernfs also call simple_statfs() from custom ->statfs() > methods which already fill the f_fsid field, but v9fs_statfs() calls > simple_statfs() only in case f_fsid was not filled and kenrfs_statfs() > overwrites f_fsid after calling simple_statfs(). > > Link: https://lore.kernel.org/r/20230919094820.g5bwharbmy2dq46w@quack3/ > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> Looks good. I agree with your choice of sb->s_dev for fsid. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > > Jan, > > This is a variant of the approach that you suggested in the Link above. > The two variations from your suggestion are: > > 1. I chose to use s_dev instead of s_uuid - I see no point in generating > s_uuid for those simple filesystems. IMO, for the simple filesystems > without open_by_handle_at() support, fanotify fid doesn't need to be > more unique than {st_dev,st_ino}, because the inode mark pins the > inode and prevent st_dev,st_ino collisions. > > 2. f_fsid is not filled by vfs according to fstype flag, but by > ->statfs() implementations (simple_statfs() for the majority). > > When applied together with the generic AT_HANDLE_FID support patches [1], > all of those simple filesystems can be watches with FAN_ERPORT_FID. > > According to your audit of filesystems in the Link above, this leaves: > "afs, coda, nfs - networking filesystems where inotify and fanotify have > dubious value anyway. > > freevxfs - the only real filesystem without f_fsid. Trivial to handle one > way or the other. > " > > There are two other filesystems that I found in my audit which also don't > fill f_fsid: fuse and gfs2. > > fuse is also a sort of a networking filesystems. Also, fuse supports NFS > export (as does nfs in some configurations) and I would like to stick to > the rule that filesystems the support decodable file handles, use an fsid > that is more unique than s_dev. > > gfs2 already has s_uuid, so we know what f_fsid should be. > BTW, afs also has a server uuid, it just doesn't set s_uuid. > > For btrfs, which fills a non-null, but non-uniform fsid, I already have > patches for inode_get_fsid [2] per your suggestion. > > IMO, we can defer dealing with all those remaining cases for later and > solve the "simple" cases first. > > Do you agree? > > So far, there were no objections to the generic AT_HANDLE_FID support > patches [1], although I am still waiting on an ACK from you on the last > patch. If this fsid patch is also aaceptable, do you think they could > be candidated for upcoming 6.7? > > Thanks, > Amir. > > [1] https://lore.kernel.org/r/20231018100000.2453965-1-amir73il@xxxxxxxxx/ > [2] https://github.com/amir73il/linux/commits/inode_fsid > > fs/efivarfs/super.c | 2 ++ > fs/hugetlbfs/inode.c | 2 ++ > fs/libfs.c | 3 +++ > 3 files changed, 7 insertions(+) > > diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c > index 996271473609..2933090ad11f 100644 > --- a/fs/efivarfs/super.c > +++ b/fs/efivarfs/super.c > @@ -30,6 +30,7 @@ static int efivarfs_statfs(struct dentry *dentry, struct kstatfs *buf) > EFI_VARIABLE_BOOTSERVICE_ACCESS | > EFI_VARIABLE_RUNTIME_ACCESS; > u64 storage_space, remaining_space, max_variable_size; > + u64 id = huge_encode_dev(dentry->d_sb->s_dev); > efi_status_t status; > > /* Some UEFI firmware does not implement QueryVariableInfo() */ > @@ -53,6 +54,7 @@ static int efivarfs_statfs(struct dentry *dentry, struct kstatfs *buf) > buf->f_blocks = storage_space; > buf->f_bfree = remaining_space; > buf->f_type = dentry->d_sb->s_magic; > + buf->f_fsid = u64_to_fsid(id); > > /* > * In f_bavail we declare the free space that the kernel will allow writing > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > index 316c4cebd3f3..c003a27be6fe 100644 > --- a/fs/hugetlbfs/inode.c > +++ b/fs/hugetlbfs/inode.c > @@ -1204,7 +1204,9 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf) > { > struct hugetlbfs_sb_info *sbinfo = HUGETLBFS_SB(dentry->d_sb); > struct hstate *h = hstate_inode(d_inode(dentry)); > + u64 id = huge_encode_dev(dentry->d_sb->s_dev); > > + buf->f_fsid = u64_to_fsid(id); > buf->f_type = HUGETLBFS_MAGIC; > buf->f_bsize = huge_page_size(h); > if (sbinfo) { > diff --git a/fs/libfs.c b/fs/libfs.c > index 37f2d34ee090..8117b24b929d 100644 > --- a/fs/libfs.c > +++ b/fs/libfs.c > @@ -41,6 +41,9 @@ EXPORT_SYMBOL(simple_getattr); > > int simple_statfs(struct dentry *dentry, struct kstatfs *buf) > { > + u64 id = huge_encode_dev(dentry->d_sb->s_dev); > + > + buf->f_fsid = u64_to_fsid(id); > buf->f_type = dentry->d_sb->s_magic; > buf->f_bsize = PAGE_SIZE; > buf->f_namelen = NAME_MAX; > -- > 2.34.1 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR