On Tue, May 12, 2020 at 10:13 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > When a file is only in a lower layer or in no layer at all, after > lookup a negative dentry will be generated in the upper layer or > even worse many negetive dentries will be generated in upper/lower > layers. These negative dentries will be useless after construction > of overlayfs' own dentry and may keep in the memory long time even > after unmount of overlayfs instance. This patch tries to kill > unnecessary negative dentry during lookup. > > Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx> > --- > v1->v2: > - Only drop negative dentry after slow lookup. > > fs/namei.c | 9 ++++++--- > fs/overlayfs/namei.c | 35 ++++++++++++++++++++++++++++++++++- > include/linux/namei.h | 8 ++++++++ > 3 files changed, 48 insertions(+), 4 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index a320371899cf..1cc2960c7804 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -1386,7 +1386,7 @@ static inline int handle_mounts(struct nameidata *nd, struct dentry *dentry, > * This looks up the name in dcache and possibly revalidates the found dentry. > * NULL is returned if the dentry does not exist in the cache. > */ > -static struct dentry *lookup_dcache(const struct qstr *name, > +struct dentry *lookup_dcache(const struct qstr *name, > struct dentry *dir, > unsigned int flags) > { > @@ -1402,6 +1402,7 @@ static struct dentry *lookup_dcache(const struct qstr *name, > } > return dentry; > } > +EXPORT_SYMBOL(lookup_dcache); > > /* > * Parent directory has inode locked exclusive. This is one > @@ -1500,7 +1501,7 @@ static struct dentry *lookup_fast(struct nameidata *nd, > } > > /* Fast lookup failed, do it the slow way */ > -static struct dentry *__lookup_slow(const struct qstr *name, > +struct dentry *__lookup_slow(const struct qstr *name, > struct dentry *dir, > unsigned int flags) > { > @@ -1536,6 +1537,7 @@ static struct dentry *__lookup_slow(const struct qstr *name, > } > return dentry; > } > +EXPORT_SYMBOL(__lookup_slow); > > static struct dentry *lookup_slow(const struct qstr *name, > struct dentry *dir, > @@ -2460,7 +2462,7 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt, > } > EXPORT_SYMBOL(vfs_path_lookup); > > -static int lookup_one_len_common(const char *name, struct dentry *base, > +int lookup_one_len_common(const char *name, struct dentry *base, > int len, struct qstr *this) > { > this->name = name; > @@ -2491,6 +2493,7 @@ static int lookup_one_len_common(const char *name, struct dentry *base, > > return inode_permission(base->d_inode, MAY_EXEC); > } > +EXPORT_SYMBOL(lookup_one_len_common); > > /** > * try_lookup_one_len - filesystem helper to lookup single pathname component > diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c > index 723d17744758..d8e71173ea75 100644 > --- a/fs/overlayfs/namei.c > +++ b/fs/overlayfs/namei.c > @@ -191,6 +191,39 @@ static bool ovl_is_opaquedir(struct dentry *dentry) > return ovl_check_dir_xattr(dentry, OVL_XATTR_OPAQUE); > } > > +static struct dentry *ovl_lookup_positive_unlocked(const char *name, > + struct dentry *base, > + int len) > +{ > + struct qstr this; > + struct dentry *ret; > + bool not_found = false; > + int err; > + > + err = lookup_one_len_common(name, base, len, &this); > + if (err) > + return ERR_PTR(err); > + > + ret = lookup_dcache(&this, base, 0); > + if (ret) > + return ret; > + > + inode_lock_shared(base->d_inode); > + ret = __lookup_slow(&this, base, 0); > + if (!IS_ERR(ret) && > + d_flags_negative(ret->d_flags)) { > + not_found = true; > + d_drop(ret); > + } > + inode_unlock_shared(base->d_inode); > + > + if (not_found) { > + dput(ret); > + ret = ERR_PTR(-ENOENT); > + } > + return ret; > +} > + I think there was a misunderstanding. This helper should be in vfs code, not duplicating vfs code and please don't duplicate code in vfs either. I think you can use a lookup flag (LOOKUP_POSITIVE_CACHE???) to describe the desired behavior and implement it inside lookup_slow(). Document the semantics as well as explain in the context of the helper the cases where modules might find this useful (because they have higher level caches). Besides the fact that this helper really needs review by Al and that duplicating subtle code is wrong in so many levels, I suppose the functionality could prove useful to other subsystems as well. Thanks, Amir.