On Thu, Oct 08, 2009 at 02:36:22PM +0200, Nick Piggin wrote:
> vfs
>
> # Samples: 273522
> #
> # Overhead  Command  Shared Object
> # ........  ..............  ................................
> #
>     48.24%  git  [kernel]
>                |
>                |--32.37%-- __d_lookup_rcu
>                |--14.14%-- link_path_walk_rcu
>                |--7.57%-- _read_unlock
>                |          |
>                |          |--96.46%-- path_init_rcu
>                |          |          do_path_lookup
>                |          |          user_path_at
>                |          |          vfs_fstatat
>                |          |          vfs_lstat
>                |          |          sys_newlstat
>                |          |          system_call_fastpath
>                |          |
>                |           --3.54%-- do_path_lookup
>                |                     user_path_at
>                |                     vfs_fstatat
>                |                     vfs_lstat
>                |                     sys_newlstat
>                |                     system_call_fastpath
>
> This one is interesting. spin_lock/spin_unlock remains very low, however
> read_unlock pops up. This would be... fs->lock. You're using threads
> then (rather than processes)?

OK, I got rid of this guy from the RCU walk. Basically I now hold
vfsmount_lock over the entire RCU path walk (which also pins the mnt) and
use a seqlock in the fs struct to get a consistent mnt,dentry pair (a
sketch of that pattern is at the end of this mail). This also simplifies
the walk, because we no longer need the complexity to avoid mntget/mntput
(just do one final mntget on the resulting mnt before dropping
vfsmount_lock).

vfsmount_lock adds one per-cpu atomic for the spinlock, and we remove two
thread-shared atomics for fs->lock, so it's a net win for both
single-threaded performance and thread-shared scalability. Latency is no
problem because we hold rcu_read_lock for the same length of time anyway.

The parallel git diff workload is improved by several percent.

Phew. I think I'll stop about here, try to get some of this crap cleaned
up, and start trying to get the rest of the filesystems done.
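
For reference, the fs struct seqlock idea looks roughly like the sketch
below. This is only an illustration, not the actual patch: the seqlock
field (here called fs->seq) and both helper names are made up, and the
reader is assumed to run with rcu_read_lock() and vfsmount_lock already
held, which is what lets it copy the pair without touching any refcounts
or fs->lock.

#include <linux/seqlock.h>
#include <linux/path.h>
#include <linux/fs_struct.h>

/*
 * Sketch only: "seq" is an assumed seqlock_t field added to struct
 * fs_struct (initialized with seqlock_init()), and the helper names
 * are invented for illustration.
 */

/*
 * Reader: caller holds rcu_read_lock() and vfsmount_lock, so the
 * dentry memory and the mnt are already pinned and no references
 * need to be taken here.
 */
static void sample_fs_pwd(struct fs_struct *fs, struct path *pwd)
{
	unsigned seq;

	do {
		seq = read_seqbegin(&fs->seq);
		*pwd = fs->pwd;	/* mnt and dentry copied together */
	} while (read_seqretry(&fs->seq, seq));
}

/*
 * Writer (e.g. chdir updating ->pwd) publishes the new pair under
 * the same seqlock.
 */
static void publish_fs_pwd(struct fs_struct *fs, const struct path *new_pwd)
{
	write_seqlock(&fs->seq);
	fs->pwd = *new_pwd;
	write_sequnlock(&fs->seq);
}

Because mnt and dentry are snapshotted inside a single read-side
critical section, a concurrent chdir can never hand the walk a
mismatched pair, and readers never dirty a thread-shared lock cacheline.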