On Wed, Oct 18, 2023 at 5:28 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Wed, 2023-10-18 at 13:00 +0300, Amir Goldstein wrote: > > AT_HANDLE_FID was added as an API for name_to_handle_at() that request > > the encoding of a file id, which is not intended to be decoded. > > > > This file id is used by fanotify to describe objects in events. > > > > So far, overlayfs is the only filesystem that supports encoding > > non-decodeable file ids, by providing export_operations with an > > ->encode_fh() method and without a ->decode_fh() method. > > > > Add support for encoding non-decodeable file ids to all the filesystems > > that do not provide export_operations, by encoding a file id of type > > FILEID_INO64_GEN from { i_ino, i_generation }. > > > > A filesystem may that does not support NFS export, can opt-out of > > encoding non-decodeable file ids for fanotify by defining an empty > > export_operations struct (i.e. with a NULL ->encode_fh() method). > > > > This allows the use of fanotify events with file ids on filesystems > > like 9p which do not support NFS export to bring fanotify in feature > > parity with inotify on those filesystems. > > > > Note that fanotify also requires that the filesystems report a non-null > > fsid. Currently, many simple filesystems that have support for inotify > > (e.g. debugfs, tracefs, sysfs) report a null fsid, so can still not be > > used with fanotify in file id reporting mode. > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > --- > > fs/exportfs/expfs.c | 30 +++++++++++++++++++++++++++--- > > include/linux/exportfs.h | 10 +++++++--- > > 2 files changed, 34 insertions(+), 6 deletions(-) > > > > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c > > index 30da4539e257..34e7d835d4ef 100644 > > --- a/fs/exportfs/expfs.c > > +++ b/fs/exportfs/expfs.c > > @@ -383,6 +383,30 @@ int generic_encode_ino32_fh(struct inode *inode, __u32 *fh, int *max_len, > > } > > EXPORT_SYMBOL_GPL(generic_encode_ino32_fh); > > > > +/** > > + * exportfs_encode_ino64_fid - encode non-decodeable 64bit ino file id > > + * @inode: the object to encode > > + * @fid: where to store the file handle fragment > > + * @max_len: maximum length to store there > > + * > > + * This generic function is used to encode a non-decodeable file id for > > + * fanotify for filesystems that do not support NFS export. > > + */ > > +static int exportfs_encode_ino64_fid(struct inode *inode, struct fid *fid, > > + int *max_len) > > +{ > > + if (*max_len < 3) { > > + *max_len = 3; > > + return FILEID_INVALID; > > + } > > + > > + fid->i64.ino = inode->i_ino; > > + fid->i64.gen = inode->i_generation; > > Note that i_ino is unsigned long and so is a 32-bit value on 32-bit > arches. If the backend storage uses 64-bit inodes, then we usually end > up hashing them down to 32-bits first. e.g. see nfs_fattr_to_ino_t(). > ceph has some similar code. > > The upshot is that if you're relying on i_ino, the value can change > between different arches, even when they are dealing with the same > backend filesystem. > > Since this is expected to be used by filesystems that don't set up > export operations, then that may just be something they have to deal > with. I'm not sure what else you can use in lieu of i_ino in this case. > True. That is one more justification for patch [1/5]. If we only support watching inodes when the reported fid is {i_ino, i_generation}, then the likelihood of collision drops considerably compared to watching sb/mount. Because watcher cannot decode fid, watcher must use name_to_handle_at() when setting up the inode watch and store it in some index, so it knows how to map from event->fid to inode later. This means that even if there is a fid collision between two inodes, then the watcher can detect the collision when setting up the inode watch. Thanks, Amir.