On Sun, Nov 3, 2019 at 8:52 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> lower_dentry can't go from positive to negative (we have it pinned),
> but it *can* go from negative to positive.  So fetching ->d_inode
> into a local variable, doing a blocking allocation, checking that
> now ->d_inode is non-NULL and feeding the value we'd fetched
> earlier to a function that won't accept NULL is not a good idea.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index a905d5f4f3b0..3c2298721359 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -319,7 +319,7 @@ static int ecryptfs_i_size_read(struct dentry *dentry, struct inode *inode)
>  static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
>  				     struct dentry *lower_dentry)
>  {
> -	struct inode *inode, *lower_inode = d_inode(lower_dentry);
> +	struct inode *inode, *lower_inode;
>  	struct ecryptfs_dentry_info *dentry_info;
>  	struct vfsmount *lower_mnt;
>  	int rc = 0;
> @@ -339,7 +339,15 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
>  	dentry_info->lower_path.mnt = lower_mnt;
>  	dentry_info->lower_path.dentry = lower_dentry;
>
> -	if (d_really_is_negative(lower_dentry)) {
> +	/*
> +	 * negative dentry can go positive under us here - its parent is not
> +	 * locked.  That's OK and that could happen just as we return from
> +	 * ecryptfs_lookup() anyway.  Just need to be careful and fetch
> +	 * ->d_inode only once - it's not stable here.
> +	 */
> +	lower_inode = READ_ONCE(lower_dentry->d_inode);
> +
> +	if (!lower_inode) {
>  		/* We want to add because we couldn't find in lower */
>  		d_add(dentry, NULL);
>  		return NULL;

Sigh!

Open coding a human readable macro to solve a subtle lookup race.
That doesn't sound like a scalable solution.
I have a feeling this is not the last patch we will be seeing along
those lines.

Seeing that developers are already confused about when they should use
d_really_is_negative() over d_is_negative() [1], and that we probably
don't want to add d_really_really_is_negative(), how about
putting that READ_ONCE inside d_really_is_negative() and repurposing it
as the macro to use when races with lookup are a concern?

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/20190903135803.GA25692@hsiangkao-HP-ZHAN-66-Pro-G1/
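
For illustration only, here is a rough userspace sketch of the
d_really_is_negative() suggestion above. Everything in it is an
assumption made for the example: the two structs are stripped-down
stand-ins for the kernel's, read_inode_once() is a hypothetical
simplification of READ_ONCE() for this one field, and the *_sketch
names are not existing kernel helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures (assumption). */
struct inode {
	unsigned long i_ino;
};

struct dentry {
	struct inode *d_inode;	/* NULL while the dentry is negative */
};

/*
 * Hypothetical, simplified stand-in for READ_ONCE(dentry->d_inode):
 * the volatile access makes the compiler load the pointer exactly
 * once instead of re-reading it later.
 */
static inline struct inode *read_inode_once(const struct dentry *dentry)
{
	return *(struct inode * const volatile *)&dentry->d_inode;
}

/* The suggested predicate: a single load of ->d_inode, then the test. */
static inline bool d_really_is_negative_sketch(const struct dentry *dentry)
{
	return read_inode_once(dentry) == NULL;
}

/*
 * Shape of the patched lookup path: fetch ->d_inode once, then test
 * and use that same value, so a negative->positive transition in
 * between cannot produce a NULL/non-NULL mismatch.
 */
static inline struct inode *lookup_interpose_sketch(const struct dentry *lower)
{
	struct inode *lower_inode = read_inode_once(lower);

	if (!lower_inode)
		return NULL;	/* treated as negative */
	return lower_inode;	/* never NULL past this point */
}
```

Note that a predicate returning only a bool still leaves a caller that
needs the inode to work from one fetched value, as the patch does, so
the second helper is shown alongside it.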