On Mon, Oct 2, 2023 at 5:37 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > ovl_permission() accesses ->layers[...].mnt; we can't have ->layers > freed without an RCU delay on fs shutdown. Fortunately, kern_unmount_array() > used to drop those mounts does include an RCU delay, so freeing is > delayed; unfortunately, the array passed to kern_unmount_array() is > formed by mangling ->layers contents and that happens without any > delays. > > Use a separate array instead; local if we have a few layers, > kmalloc'ed if there's a lot of them. If allocation fails, > fall back to kern_unmount() for individual mounts; it's > not a fast path by any stretch of imagination. If that is the case, then having 3 different code paths seems quite an excessive over optimization... I think there is a better way - layout the mounts array linearly in ofs->mounts[] to begin with, remove .mnt out of ovl_layer and use ofs->mounts[layer->idx] to get to a layer's mount. I can try to write this patch to see how it ends up looking. Thanks, Amir. > > Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > --- > fs/overlayfs/ovl_entry.h | 1 - > fs/overlayfs/params.c | 26 ++++++++++++++++++++------ > 2 files changed, 20 insertions(+), 7 deletions(-) > > diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h > index e9539f98e86a..618b63bb7987 100644 > --- a/fs/overlayfs/ovl_entry.h > +++ b/fs/overlayfs/ovl_entry.h > @@ -30,7 +30,6 @@ struct ovl_sb { > }; > > struct ovl_layer { > - /* ovl_free_fs() relies on @mnt being the first member! */ > struct vfsmount *mnt; > /* Trap in ovl inode cache */ > struct inode *trap; > diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c > index b9355bb6d75a..ab594fd407b4 100644 > --- a/fs/overlayfs/params.c > +++ b/fs/overlayfs/params.c > @@ -738,8 +738,15 @@ int ovl_init_fs_context(struct fs_context *fc) > void ovl_free_fs(struct ovl_fs *ofs) > { > struct vfsmount **mounts; > + struct vfsmount *m[16]; > + unsigned n = ofs->numlayer; > unsigned i; > > + if (n > 16) > + mounts = kmalloc_array(n, sizeof(struct mount *), GFP_KERNEL); > + else > + mounts = m; > + > iput(ofs->workbasedir_trap); > iput(ofs->indexdir_trap); > iput(ofs->workdir_trap); > @@ -752,14 +759,21 @@ void ovl_free_fs(struct ovl_fs *ofs) > if (ofs->upperdir_locked) > ovl_inuse_unlock(ovl_upper_mnt(ofs)->mnt_root); > > - /* Hack! Reuse ofs->layers as a vfsmount array before freeing it */ > - mounts = (struct vfsmount **) ofs->layers; If we are getting rid of the hack, we should remove the associated static_assert() statements in ovl_entry.h. > - for (i = 0; i < ofs->numlayer; i++) { > + for (i = 0; i < n; i++) { > iput(ofs->layers[i].trap); > - mounts[i] = ofs->layers[i].mnt; > - kfree(ofs->layers[i].name); > + if (unlikely(!mounts)) > + kern_unmount(ofs->layers[i].mnt); > + else > + mounts[i] = ofs->layers[i].mnt; > } > - kern_unmount_array(mounts, ofs->numlayer); > + if (mounts) { > + kern_unmount_array(mounts, n); > + if (mounts != m) > + kfree(mounts); > + } > + // by this point we had an RCU delay from kern_unmount{_array,}() > + for (i = 0; i < n; i++) > + kfree(ofs->layers[i].name); > kfree(ofs->layers); > for (i = 0; i < ofs->numfs; i++) > free_anon_bdev(ofs->fs[i].pseudo_dev); > -- > 2.39.2 >