Re: [PTCH] push down lock_super and BKL into ->put_super

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 05, 2009 at 03:40:36PM +0200, Christoph Hellwig wrote:
> From: Christoph Hellwig <hch@xxxxxx>
> 
> Move s_lock and BKL into ->put_super from the only caller.  A couple of
> filesystems had trivial enough ->put_super (only kfree and NULLing of
> s_fs_info + stuff in there) to not get any locking: code, cramfs, efs,
> hugetlbfs, omfs, qnx4, shmem, all others got the full treatment.  Most
> of them probably don't need it, but I'd rather sort that out individually.
> Preferably after all the other BKL and lock_super pushdowns in that area.

NAK.  Getting BKL down there is fine, but lock_super() in that place should
simply die now.

Look: lock_super() does two things - exclusion on s_lock and marking the
task as "holds excl fs resource, other tasks are likely to be blocked by
it, try to push its IO at higher priority" for CFQ.

Let's consider the exclusion first.  We *can't* get contention on s_lock
at that point.  The only thing that takes s_lock is lock_super() itself.
Callers:
	generic_shutdown_super (this one, s_umount held exclusive since
it's always called from ->kill_sb) [1]
	write_super() from sync_supers() - s_umount held shared
	do_remount_sb() - s_umount held [2]
	file_fsync() - we shouldn't ever be in generic_shutdown_super()
while there are files opened on that superblock, TYVM
	__sync_filesystem() - called from sync_filesystems() (s_umount shared)
and from sync_filesystem().
Callers of sync_filesystem():
	fsync_bdev(), freeze_bdev() - s_umount shared.
	fs/cachefiles - there we have a vfsmount with ->mnt_sb pointing to
that superblock, so generic_shutdown_super() shouldn't be possible (active
reference to superblock)
	and generic_shutdown_super() itself.  Again, s_umount exclusive and
in any case, there's only one call of that sucker in the entire lifetime of
given struct super_block.

So we can't have contention on s_lock in that area - any such contention would
either happen on s_umount or would be a result of severely buggered refcounting
on struct super_block.

That leaves us with fs_excl alone.  And fair enough, this thing *does* hold
exclusive fs resources.  Only one, in fact - s_umount.  Nothing is going to
wait for it to finish in order to do something with fs itself.  So if we
are going to bump fs_excl, we ought to do it in deactivate_super(), not
here.  After all, ->put_super() is not where the majority of IO on final
umount will be happening...

As for this patch: we need to
	* replace lock_super/unlock_super() with get_fs_excl()/put_fs_excl()
in the same places.
	* don't do lock_super() inside ->put_super() at all.
	* take BKL down to ->put_super().

I'm going to put it in as two patches - lock_super part + trimmed down
version of this one, sans lock_super.

[1] and from grepping through the instances... some of the callers are
seriously fishy.  All of them *are* in ->kill_sb() and not called other
that via deactivate_super().

[2] normally exclusive, but there are callers holding it shared; all of them
should switch to excl, but that doesn't matter for our purposes.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux