On Mon 08-07-13 22:44:53, Dave Chinner wrote:
<snipped some nice XFS results ;)>
> So, let's look at ext4 vs btrfs vs XFS at 16-way (this is on the
> 3.10-cil kernel I've been testing XFS on):
>
>             create            walk     unlink
>         time(s)   rate       time(s)   time(s)
> xfs       222   266k+-32k      170       295
> ext4      978    54k+- 2k      325      2053
> btrfs    1223    47k+- 8k      366     12000(*)
>
> (*) Estimate based on a removal rate of 18.5 minutes for the first
> 4.8 million inodes.
>
> Basically, neither btrfs nor ext4 have any concurrency scaling to
> demonstrate, and unlinks on btrfs are just plain woeful.

  Thanks for posting the numbers. There isn't anyone seriously testing ext4
SMP scalability AFAIK so it's not surprising it sucks.

> ext4 create rate is limited by the extent cache LRU locking:
>
> -  41.81%  [kernel]  [k] __ticket_spin_trylock
>    - __ticket_spin_trylock
>       - 60.67% _raw_spin_lock
>          - 99.60% ext4_es_lru_add
>             + 99.63% ext4_es_lookup_extent

  At least this should improve with the patches in 3.11-rc1.

>       - 39.15% do_raw_spin_lock
>          - _raw_spin_lock
>             + 95.38% ext4_es_lru_add
>               0.51% insert_inode_locked
>                    __ext4_new_inode
> -  16.20%  [kernel]  [k] native_read_tsc
>    - native_read_tsc
>       - 60.91% delay_tsc
>            __delay
>            do_raw_spin_lock
>          + _raw_spin_lock
>       - 39.09% __delay
>            do_raw_spin_lock
>          + _raw_spin_lock
>
> Ext4 unlink is serialised on orphan list processing:
>
> -  12.67%  [kernel]  [k] __mutex_unlock_slowpath
>    - __mutex_unlock_slowpath
>       - 99.95% mutex_unlock
>          + 54.37% ext4_orphan_del
>          + 43.26% ext4_orphan_add
> +   5.33%  [kernel]  [k] __mutex_lock_slowpath

  ext4 can do better here I'm sure. The current solution is pretty
simplistic. At least we could use a spinlock for the in-memory orphan list
and atomic ops for the on-disk one (as it's only a singly linked list). But
I'm not sure I'll find time to look into this in the foreseeable future...

								Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR