[RFC] umount/__detach_mounts() race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



	umount_tree() is very definitely not supposed to be called
on MNT_UMOUNT subtrees (== stuck-together fragments that got
unmounted, but not split into individual mount nodes).  Refcounting
rules are different there and umount_tree() assumes that we start with
the normal ones.

	do_umount() appears to be checking for that:

	if (flags & MNT_DETACH) {
		if (!list_empty(&mnt->mnt_list))
			umount_tree(mnt, UMOUNT_PROPAGATE);
		retval = 0;
	} else {
		shrink_submounts(mnt);
		retval = -EBUSY;
		if (!propagate_mount_busy(mnt, 2)) {
			if (!list_empty(&mnt->mnt_list))
				umount_tree(mnt, UMOUNT_PROPAGATE|UMOUNT_SYNC);
			retval = 0;
		}
	}

which would prevent umount_tree() on those - mnt_list eviction happens
for the same nodes that get MNT_UMOUNT.  However, shrink_submounts()
will call umount_tree() for e.g. nfs automounts it finds on victim
mount, and if ours happens to be already unmounted, with automounts
stuck to it, we have trouble.

It looks like something that should be impossible to hit, but...

A: umount(2) looks the sucker up
B: rmdir(2) in another namespace (where nothing is mounted on that mountpoint)
does __detach_mounts(), which grabs namespace_sem, sees the mount A is about
to try and kill and calls umount_tree(mnt, UMOUNT_CONNECTED).  Which detaches
our mount (and its children, automounts included) from the namespace it's in,
modifies their refcounts accordingly and keeps the entire thing in one
piece.
A: in do_umount() blocks on namespace_sem
B: drops namespace_sem
A: gets to the quoted code.  mnt is already MNT_UMOUNT (and has empty
->mnt_list), but it does have (equally MNT_UMOUNT) automounts under it,
etc.  So shrink_submounts() finds something to umount and calls umount_tree().
Buggered refcounts happen.

Does anybody see a problem with the following patch?

diff --git a/fs/namespace.c b/fs/namespace.c
index 42d4fc21263b2..1604b9d7a69d9 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1654,21 +1654,20 @@ static int do_umount(struct mount *mnt, int flags)
 	lock_mount_hash();
 
 	/* Recheck MNT_LOCKED with the locks held */
+	/* ... and don't go there if we'd raced and it's already unmounted */
 	retval = -EINVAL;
-	if (mnt->mnt.mnt_flags & MNT_LOCKED)
+	if (mnt->mnt.mnt_flags & (MNT_LOCKED|MNT_UMOUNT))
 		goto out;
 
 	event++;
 	if (flags & MNT_DETACH) {
-		if (!list_empty(&mnt->mnt_list))
-			umount_tree(mnt, UMOUNT_PROPAGATE);
+		umount_tree(mnt, UMOUNT_PROPAGATE);
 		retval = 0;
 	} else {
 		shrink_submounts(mnt);
 		retval = -EBUSY;
 		if (!propagate_mount_busy(mnt, 2)) {
-			if (!list_empty(&mnt->mnt_list))
-				umount_tree(mnt, UMOUNT_PROPAGATE|UMOUNT_SYNC);
+			umount_tree(mnt, UMOUNT_PROPAGATE|UMOUNT_SYNC);
 			retval = 0;
 		}
 	}



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux