On Wed, 9 Jun 2021 at 20:12, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > Server responds to LOOKUP and other ops (READDIRPLUS/CREATE/MKNOD/...) > with outarg containing nodeid and generation. > > If a fuse inode is found in inode cache with the same nodeid but > different generation, the existing fuse inode should be unhashed and > marked "bad" and a new inode with the new generation should be hashed > instead. > > This can happen, for example, with passhrough fuse filesystem that > returns the real filesystem ino/generation on lookup and where real inode > numbers can get recycled due to real files being unlinked not via the fuse > passthrough filesystem. > > With current code, this situation will not be detected and an old fuse > dentry that used to point to an older generation real inode, can be used > to access a completely new inode, which should be accessed only via the > new dentry. > > Note that because the FORGET message carries the nodeid w/o generation, > the server should wait to get FORGET counts for the nlookup counts of > the old and reused inodes combined, before it can free the resources > associated to that nodeid. > > Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxgDMGUpK35huwqFYGH_idBB8S6eLiz85o0DDKOyDH4Syg@xxxxxxxxxxxxxx/ > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > --- > > Miklos, > > I was able to reproduce this issue with a passthrough fs that stored > ino+generation and uses then to open fd on lookup. > > I extended libfuse's test_syscalls [1] program to demonstrate the issue > described in commit message. > > Max, IIUC, you are making a modification to virtiofs-rs that would > result is being exposed to this bug. You are welcome to try out the > test and let me know if you can reproduce the issue. > > Note that some test_syscalls test fail with cache enabled, so libfuse's > test_examples.py only runs test_syscalls in cache disabled config. > > Thanks, > Amir. > > [1] https://github.com/amir73il/libfuse/commits/test-reused-inodes > > fs/fuse/dir.c | 3 ++- > fs/fuse/fuse_i.h | 9 +++++++++ > fs/fuse/inode.c | 4 ++-- > fs/fuse/readdir.c | 7 +++++-- > 4 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c > index 1b6c001a7dd1..b06628fd7d8e 100644 > --- a/fs/fuse/dir.c > +++ b/fs/fuse/dir.c > @@ -239,7 +239,8 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags) > if (!ret) { > fi = get_fuse_inode(inode); > if (outarg.nodeid != get_node_id(inode) || > - (bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) { > + fuse_stale_inode(inode, outarg.generation, > + &outarg.attr)) { This changes behavior in the inode_wrong_type() case. I guess that was not intended. > fuse_queue_forget(fm->fc, forget, > outarg.nodeid, 1); > goto invalid; > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > index 7e463e220053..f1bd28c176a9 100644 > --- a/fs/fuse/fuse_i.h > +++ b/fs/fuse/fuse_i.h > @@ -867,6 +867,15 @@ static inline u64 fuse_get_attr_version(struct fuse_conn *fc) > return atomic64_read(&fc->attr_version); > } > > +static inline bool fuse_stale_inode(const struct inode *inode, int generation, > + struct fuse_attr *attr) > +{ > + return inode->i_generation != generation || > + inode_wrong_type(inode, attr->mode) || > + (bool) IS_AUTOMOUNT(inode) != > + (bool) (attr->flags & FUSE_ATTR_SUBMOUNT); > +} > + > static inline void fuse_make_bad(struct inode *inode) > { > remove_inode_hash(inode); > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index 393e36b74dc4..257bb3e1cac8 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -350,8 +350,8 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid, > inode->i_generation = generation; > fuse_init_inode(inode, attr); > unlock_new_inode(inode); > - } else if (inode_wrong_type(inode, attr->mode)) { > - /* Inode has changed type, any I/O on the old should fail */ > + } else if (fuse_stale_inode(inode, generation, attr)) { This one adds the automount check. That should be okay, since directories must never have aliases and such a beast would error out anyway later on d_splice_alias. > + /* nodeid was reused, any I/O on the old inode should fail */ > fuse_make_bad(inode); > iput(inode); > goto retry; > diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c > index 277f7041d55a..bc267832310c 100644 > --- a/fs/fuse/readdir.c > +++ b/fs/fuse/readdir.c > @@ -200,9 +200,12 @@ static int fuse_direntplus_link(struct file *file, > if (!d_in_lookup(dentry)) { > struct fuse_inode *fi; > inode = d_inode(dentry); > + if (inode && get_node_id(inode) != o->nodeid) > + inode = NULL; > if (!inode || > - get_node_id(inode) != o->nodeid || > - inode_wrong_type(inode, o->attr.mode)) { > + fuse_stale_inode(inode, o->generation, &o->attr)) { Again, automount check added, I'm not at all sure how automount and readdirplus interact. This needs to be looked at (though it's mostly a separate issue from this patch). Thanks, Miklos