Re: [GIT PULL] Detaching mounts on unlink for 3.15

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 17, 2014 at 11:12:03PM +0100, Al Viro wrote:

> I'd probably turn mntput_no_expire() into something like
> static struct mount *__mntput(struct mount *m)
> that would return NULL if nothing needs to be killed and returned m
> if m really needs killing.  Leaving the caller to decide what to do
> with that puppy.  We have, as it is, exactly two callers - exit path
> in sys_umount() and mntput().  So we add two more functions:
> static void kill_mnt_async(struct mount *m)
> and
> static void kill_mnt_sync(struct mount *m)
> both being no-op on NULL.  Then in sys_umount() and mntput() we do
> 	kill_mnt_async(__mntput(mnt));
> and in mntput_sync() - kill_mnt_sync(__mntput(mnt));
> For that matter, kill_mnt_sync() (basically, your variant with completions)
> can be folded into mntput_sync().

Actually, all kern_unmount() callers are doing that from fairly shallow
stack depth and all simple_release_fs() ones are dealing with rather
trivial ->kill_sb().  So mntput_sync() is an overkill; all we need is
	if (mnt->mnt_flags & MNT_INTERNAL) {
		cleanup_mnt(mnt);
		return;
	}
	<do task_work_add or schedule_delayed_work song and dance>
right in the end of mntput_no_expire().  OK, now I have something that
looks like a complete solution.  The last missing bit is to take all
filp_close() of acct->file to kernel thread, and have them done via
__fput_sync() there.  Then auto-close (done from cleanup_mnt()) will
consist of shutting down all affected acct and waiting for that kernel
thread to run through everything currently in its queue.  That'll take
care of waiting until acct(NULL) done by somebody else gets through closing
the file and through corresponding mntput().  And *those* mntput() also
can be synchronous - they are clones of the one we hadn't finished shutting
down yet, so both dput() and deactivate_super() will bugger off immediately.
So we just mark those instead-of-mnt_pin() clones as MNT_INTERNAL.  Voila.
After that ->mnt_pinned crap dies, acct auto-close ought to be race-free
and we get the actual fs shutdown guaranteed to be on shallow stack, without
extra context switches, etc. in the normal case.

	Let's see if that survives testing...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux