Re: [rfc][patch] store-free path walking

On Thu, Oct 08, 2009 at 02:36:22PM +0200, Nick Piggin wrote:
> vfs
> 
> # Samples: 273522
> #
> # Overhead         Command                     Shared Object
> # ........  ..............  ................................
> #
>     48.24%             git  [kernel]
>                 |
>                 |--32.37%-- __d_lookup_rcu
>                 |--14.14%-- link_path_walk_rcu
>                 |--7.57%-- _read_unlock
>                 |          |
>                 |          |--96.46%-- path_init_rcu
>                 |          |          do_path_lookup
>                 |          |          user_path_at
>                 |          |          vfs_fstatat
>                 |          |          vfs_lstat
>                 |          |          sys_newlstat
>                 |          |          system_call_fastpath
>                 |          |
>                 |           --3.54%-- do_path_lookup
>                 |                     user_path_at
>                 |                     vfs_fstatat
>                 |                     vfs_lstat
>                 |                     sys_newlstat
>                 |                     system_call_fastpath

> This one is interesting. spin_lock/spin_unlock remains very low, however
> read_unlock pops up. This would be... fs->lock. You're using threads
> then (rather than processes)?

OK, I got rid of this guy from the RCU walk. Basically I now hold
vfsmount_lock over the entire RCU path walk (which also pins the mnt)
and use a seqlock in the fs struct to get a consistent (mnt, dentry)
pair. This also simplifies the walk because we don't need the
complexity to avoid mntget/mntput (just do one final mntget on the
resulting mnt before dropping vfsmount_lock).

vfsmount_lock adds one per-cpu atomic for the spinlock, and we
remove two thread-shared atomics for fs->lock, so it's a net win for
both single-threaded performance and thread-shared scalability.
Latency is no problem because we hold rcu_read_lock for the same
length of time anyway.

The parallel git diff workload is improved by several percent.

Phew. I think I'll stop about here, try to get some of this crap
cleaned up, and start on getting the rest of the filesystems done.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
