On Tue, Apr 07, 2020 at 11:35:46AM +0900, Levi wrote: > When we try to bind mount on network namespace (ex) /proc/{pid}/ns/net, > inode's private data can have dangling pointer to net_namespace that was > already freed in below case. > > 1. Forking the process. > 2. [PARENT] Waiting the Child till the end. > 3. [CHILD] call unshare for creating new network namespace > 4. [CHILD] Bind mount with /proc/self/ns/net to some mount point. > 5. [CHILD] Exit child. > 6. [PARENT] Try to setns with binded mount point > > In step 5, net_namespace made by child process'll be freed, > But in bind mount point, it still held the pointer to net_namespace made > by child process. > In this situation, when parent try to call "setns" systemcall with the > bind mount point, parent process try to access to freed memory, That > makes memory corruption. > > This patch fix the above scenario by increaseing reference count. This can't be the right fix. > +#ifdef CONFIG_NET_NS > + if (!(flag & CL_COPY_MNT_NS_FILE) && is_net_ns_file(root)) { > + ns = get_proc_ns(d_inode(root)); > + if (ns == NULL || ns->ops->type != CLONE_NEWNET) { > + err = -EINVAL; > + > + goto out_free; > + } > + > + net = to_net_ns(ns); > + net = get_net(net); No. This is completely wrong. You have each struct mount pointing to that sucker to grab an extra reference on an object; you do *NOT* drop said reference when struct mount is destroyed. You are papering over a dangling pointer of some sort by introducing a trivially exploitable leak that happens to hit your scenario. Hell, do (your step 4 + umount your mountpoint) in a loop, then watch what happens to refcounts with that patch. This is bollocks; the reference is *NOT* in struct mount. At all. It's not even in struct dentry. What it's attached to is struct inode and it should be pinned as long as that inode is alive - it's dropped in nsfs_evict(). And if _that_ gets called while dentry is still pinned (as ->mnt_root of something), you have much worse problems. Could you post a reproducer, preferably one that would trigger an oops on mainline?