ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes: > Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes: > >> On Mon, Mar 12, 2018 at 03:23:44PM -0500, Eric W. Biederman wrote: >> >>> Of the two code paths you are concert about: >>> >>> For path path_connected looking at s_root is a heuristic to avoid >>> calling is_subdir every time we need to do that check. If the heuristic >>> fails we still have is_subdir which should remain accurate. If >>> is_subdir fails the path is genuinely not connected at that moment >>> and failing is the correct thing to do. >> >> Umm... That might be not good enough - the logics is "everything's >> reachable from ->s_root anyway, so we might as well not bother checking". >> For NFS it's simply not true. > > If I am parsing the code correctly path_connected is broken for nfsv2 > and nfsv3 when NFS_MOUNT_UNSHARED is not set. nfsv4 appears to make > a kernel mount of the real root of the filesystem properly setting > s_root and then finds the child it is mounting. > >> We can mount server:/foo/bar/baz on /tmp/a, then server:/foo on /tmp/b >> and we'll have ->s_root pointing to a subtree of what's reachable at >> /tmp/b. Play with renames under /tmp/b and you just might end up with >> a problem. And mount on /tmp/a will be (mistakenly) considered to >> be safe, since it satisfies the heuristics in path_connected(). > > Agreed. > > Which means that if you mount server:/foo/bar/baz first and then > mount server:/foo with an appropriate rename you might be able > to see all of server:/foo or possibly server:/ > > Hmm.. > > Given that nfs_kill_super calls generic_shutdown_super and > generic_shutdown_super calls shrink_dcache_for_umount > I would argue that nfsv2 and nfsv3 are buggy in the same > case, as shrink_dcache_for_umount is called on something > that is not the root of the filesystem's dentry tree. Ah. I see now there is now the s_roots list that handles that bit of strangeness. So one path is to simply remove the heuristic from path_connected. Another path is to have nfsv2 and nfsv3 not set s_root at all. Leaving the heuristic working for the rest of the filesystems, and generally simplifying the code. Something like the diff below I should think. Eric fs/dcache.c | 3 ++- fs/nfs/getroot.c | 35 +---------------------------------- fs/nfs/super.c | 2 -- 3 files changed, 3 insertions(+), 37 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 7c38f39958bc..ebe63ca026da 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1501,7 +1501,8 @@ void shrink_dcache_for_umount(struct super_block *sb) dentry = sb->s_root; sb->s_root = NULL; - do_one_tree(dentry); + if (dentry) + do_one_tree(dentry); while (!hlist_bl_empty(&sb->s_roots)) { dentry = dget(hlist_bl_entry(hlist_bl_first(&sb->s_roots), struct dentry, d_hash)); diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c index 391dafaf9182..f609d583fd2b 100644 --- a/fs/nfs/getroot.c +++ b/fs/nfs/getroot.c @@ -36,35 +36,6 @@ #define NFSDBG_FACILITY NFSDBG_CLIENT -/* - * Set the superblock root dentry. - * Note that this function frees the inode in case of error. - */ -static int nfs_superblock_set_dummy_root(struct super_block *sb, struct inode *inode) -{ - /* The mntroot acts as the dummy root dentry for this superblock */ - if (sb->s_root == NULL) { - sb->s_root = d_make_root(inode); - if (sb->s_root == NULL) - return -ENOMEM; - ihold(inode); - /* - * Ensure that this dentry is invisible to d_find_alias(). - * Otherwise, it may be spliced into the tree by - * d_splice_alias if a parent directory from the same - * filesystem gets mounted at a later time. - * This again causes shrink_dcache_for_umount_subtree() to - * Oops, since the test for IS_ROOT() will fail. - */ - spin_lock(&d_inode(sb->s_root)->i_lock); - spin_lock(&sb->s_root->d_lock); - hlist_del_init(&sb->s_root->d_u.d_alias); - spin_unlock(&sb->s_root->d_lock); - spin_unlock(&d_inode(sb->s_root)->i_lock); - } - return 0; -} - /* * get an NFS2/NFS3 root dentry from the root filehandle */ @@ -102,11 +73,7 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh, goto out; } - error = nfs_superblock_set_dummy_root(sb, inode); - if (error != 0) { - ret = ERR_PTR(error); - goto out; - } + /* Leave nfsv2 and nfsv3 s_root == NULL */ /* root dentries normally start off anonymous and get spliced in later * if the dentry tree reaches them; however if the dentry already diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 29bacdc56f6a..30b45ea4dfe9 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -2625,9 +2625,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server *server, } s->s_bdi->ra_pages = server->rpages * NFS_MAX_READAHEAD; server->super = s; - } - if (!s->s_root) { /* initial superblock/root creation */ mount_info->fill_super(s, mount_info); nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned);