On Fri, Oct 11, 2024 at 02:38:29PM +0200, Mickaël Salaün wrote: > On Fri, Oct 11, 2024 at 08:22:51AM -0400, Trond Myklebust wrote: > > On Fri, 2024-10-11 at 12:15 +0200, Mickaël Salaün wrote: > > > On Thu, Oct 10, 2024 at 03:28:12PM -0400, Trond Myklebust wrote: > > > > On Thu, 2024-10-10 at 17:26 +0200, Mickaël Salaün wrote: > > > > > When a filesystem manages its own inode numbers, like NFS's > > > > > fileid > > > > > shown > > > > > to user space with getattr(), other part of the kernel may still > > > > > expose > > > > > the private inode->ino through kernel logs and audit. > > > > > > > > > > Another issue is on 32-bit architectures, on which ino_t is 32 > > > > > bits, > > > > > whereas the user space's view of an inode number can still be 64 > > > > > bits. > > > > > > > > > > Add a new inode_get_ino() helper calling the new struct > > > > > inode_operations' get_ino() when set, to get the user space's > > > > > view of > > > > > an > > > > > inode number. inode_get_ino() is called by generic_fillattr(). > > > > > > > > > > Implement get_ino() for NFS. > > > > > > > > > > Cc: Trond Myklebust <trondmy@xxxxxxxxxx> > > > > > Cc: Anna Schumaker <anna@xxxxxxxxxx> > > > > > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> > > > > > Cc: Christian Brauner <brauner@xxxxxxxxxx> > > > > > Cc: Jan Kara <jack@xxxxxxx> > > > > > Signed-off-by: Mickaël Salaün <mic@xxxxxxxxxxx> > > > > > --- > > > > > > > > > > I'm not sure about nfs_namespace_getattr(), please review > > > > > carefully. > > > > > > > > > > I guess there are other filesystems exposing inode numbers > > > > > different > > > > > than inode->i_ino, and they should be patched too. > > > > > --- > > > > > fs/nfs/inode.c | 6 ++++-- > > > > > fs/nfs/internal.h | 1 + > > > > > fs/nfs/namespace.c | 2 ++ > > > > > fs/stat.c | 2 +- > > > > > include/linux/fs.h | 9 +++++++++ > > > > > 5 files changed, 17 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c > > > > > index 542c7d97b235..5dfc176b6d92 100644 > > > > > --- a/fs/nfs/inode.c > > > > > +++ b/fs/nfs/inode.c > > > > > @@ -83,18 +83,19 @@ EXPORT_SYMBOL_GPL(nfs_wait_bit_killable); > > > > > > > > > > /** > > > > > * nfs_compat_user_ino64 - returns the user-visible inode number > > > > > - * @fileid: 64-bit fileid > > > > > + * @inode: inode pointer > > > > > * > > > > > * This function returns a 32-bit inode number if the boot > > > > > parameter > > > > > * nfs.enable_ino64 is zero. > > > > > */ > > > > > -u64 nfs_compat_user_ino64(u64 fileid) > > > > > +u64 nfs_compat_user_ino64(const struct *inode) > > > > > { > > > > > #ifdef CONFIG_COMPAT > > > > > compat_ulong_t ino; > > > > > #else > > > > > unsigned long ino; > > > > > #endif > > > > > + u64 fileid = NFS_FILEID(inode); > > > > > > > > > > if (enable_ino64) > > > > > return fileid; > > > > > @@ -103,6 +104,7 @@ u64 nfs_compat_user_ino64(u64 fileid) > > > > > ino ^= fileid >> (sizeof(fileid)-sizeof(ino)) * > > > > > 8; > > > > > return ino; > > > > > } > > > > > +EXPORT_SYMBOL_GPL(nfs_compat_user_ino64); > > > > > > > > > > int nfs_drop_inode(struct inode *inode) > > > > > { > > > > > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h > > > > > index 430733e3eff2..f5555a71a733 100644 > > > > > --- a/fs/nfs/internal.h > > > > > +++ b/fs/nfs/internal.h > > > > > @@ -451,6 +451,7 @@ extern void nfs_zap_acl_cache(struct inode > > > > > *inode); > > > > > extern void nfs_set_cache_invalid(struct inode *inode, unsigned > > > > > long > > > > > flags); > > > > > extern bool nfs_check_cache_invalid(struct inode *, unsigned > > > > > long); > > > > > extern int nfs_wait_bit_killable(struct wait_bit_key *key, int > > > > > mode); > > > > > +extern u64 nfs_compat_user_ino64(const struct *inode); > > > > > > > > > > #if IS_ENABLED(CONFIG_NFS_LOCALIO) > > > > > /* localio.c */ > > > > > diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c > > > > > index e7494cdd957e..d9b1e0606833 100644 > > > > > --- a/fs/nfs/namespace.c > > > > > +++ b/fs/nfs/namespace.c > > > > > @@ -232,11 +232,13 @@ nfs_namespace_setattr(struct mnt_idmap > > > > > *idmap, > > > > > struct dentry *dentry, > > > > > const struct inode_operations nfs_mountpoint_inode_operations = > > > > > { > > > > > .getattr = nfs_getattr, > > > > > .setattr = nfs_setattr, > > > > > + .get_ino = nfs_compat_user_ino64, > > > > > }; > > > > > > > > > > const struct inode_operations nfs_referral_inode_operations = { > > > > > .getattr = nfs_namespace_getattr, > > > > > .setattr = nfs_namespace_setattr, > > > > > + .get_ino = nfs_compat_user_ino64, > > > > > }; > > > > > > > > > > static void nfs_expire_automounts(struct work_struct *work) > > > > > diff --git a/fs/stat.c b/fs/stat.c > > > > > index 41e598376d7e..05636919f94b 100644 > > > > > --- a/fs/stat.c > > > > > +++ b/fs/stat.c > > > > > @@ -50,7 +50,7 @@ void generic_fillattr(struct mnt_idmap *idmap, > > > > > u32 > > > > > request_mask, > > > > > vfsgid_t vfsgid = i_gid_into_vfsgid(idmap, inode); > > > > > > > > > > stat->dev = inode->i_sb->s_dev; > > > > > - stat->ino = inode->i_ino; > > > > > + stat->ino = inode_get_ino(inode); > > > > > stat->mode = inode->i_mode; > > > > > stat->nlink = inode->i_nlink; > > > > > stat->uid = vfsuid_into_kuid(vfsuid); > > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > > > > index e3c603d01337..0eba09a21cf7 100644 > > > > > --- a/include/linux/fs.h > > > > > +++ b/include/linux/fs.h > > > > > @@ -2165,6 +2165,7 @@ struct inode_operations { > > > > > struct dentry *dentry, struct > > > > > fileattr > > > > > *fa); > > > > > int (*fileattr_get)(struct dentry *dentry, struct > > > > > fileattr > > > > > *fa); > > > > > struct offset_ctx *(*get_offset_ctx)(struct inode > > > > > *inode); > > > > > + u64 (*get_ino)(const struct inode *inode); > > > > > } ____cacheline_aligned; > > > > > > > > > > static inline int call_mmap(struct file *file, struct > > > > > vm_area_struct > > > > > *vma) > > > > > @@ -2172,6 +2173,14 @@ static inline int call_mmap(struct file > > > > > *file, > > > > > struct vm_area_struct *vma) > > > > > return file->f_op->mmap(file, vma); > > > > > } > > > > > > > > > > +static inline u64 inode_get_ino(struct inode *inode) > > > > > +{ > > > > > + if (unlikely(inode->i_op->get_ino)) > > > > > + return inode->i_op->get_ino(inode); > > > > > + > > > > > + return inode->i_ino; > > > > > +} > > > > > + > > > > > extern ssize_t vfs_read(struct file *, char __user *, size_t, > > > > > loff_t > > > > > *); > > > > > extern ssize_t vfs_write(struct file *, const char __user *, > > > > > size_t, > > > > > loff_t *); > > > > > extern ssize_t vfs_copy_file_range(struct file *, loff_t , > > > > > struct > > > > > file *, > > > > > > > > There should be no need to add this callback to generic_fillattr(). > > > > > > > > generic_fillattr() is a helper function for use by the filesystems > > > > themselves. It should never be called from any outside functions, > > > > as > > > > the inode number would be far from the only attribute that will be > > > > incorrect. > > > > > > This change will not impact filesystems except the ones that > > > implement the new > > > get_ino() operation, and I suspect NFS is not or will not be the only > > > one. We > > > need to investigate on other filesystems but I wanted to get a first > > > feedback > > > before. Using get_ino() in generic_fillattr() should guarantee a > > > consistent > > > getattr() wrt inode numbers. I forgot to remove the now-useless call > > > to > > > nfs_compat_user_ino64() in nfs_getattr() for this to make sense: > > > > You're missing my point. From the point of view of NFS, all you're > > doing is to replace a relatively fast direct call to > > nfs_compat_user_ino64() with a much slower callback. There is no > > benefit at all to anyone in doing so. > > > > Yes, other filesystems may also want to replace this and/or other > > fields in the "struct kstat" that they return, but none of them should > > have a problem with doing that after the actual call to > > generic_fillattr(). > > OK, I'll remove this part then. My concern is about maintaining consistency. If this get_ino() operation is not visible to filesystem developers, I think there is a good chance for desynchronization between the getattr() implementation and get get_ino(). Anyway, I guess the performance argument wins. > > > > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, Hammerspace > > trond.myklebust@xxxxxxxxxxxxxxx > > > >