Hi Dave,

On Mon, Jul 08, 2013 at 10:44:53PM +1000, Dave Chinner wrote:
[...]
> So, lets look at ext4 vs btrfs vs XFS at 16-way (this is on the
> 3.10-cil kernel I've been testing XFS on):
>
>             create              walk     unlink
>          time(s)   rate        time(s)   time(s)
> xfs        222    266k+-32k      170       295
> ext4       978     54k+- 2k      325      2053
> btrfs     1223     47k+- 8k      366     12000(*)
>
> (*) Estimate based on a removal rate of 18.5 minutes for the first
> 4.8 million inodes.
>
> Basically, neither btrfs or ext4 have any concurrency scaling to
> demonstrate, and unlinks on btrfs a just plain woeful.
>
> ext4 create rate is limited by the extent cache LRU locking:

I have a patch to fix this problem, and it has been applied in
3.11-rc1.  The patch is (d3922a77):

  ext4: improve extent cache shrink mechanism to avoid to burn CPU time

I would really appreciate it if you could rerun your tests with this
patch applied.  I just want to make sure that the problem has been
fixed.  At least in my own testing it looks fine.

Thanks,
                                                - Zheng

>
> -  41.81%  [kernel]  [k] __ticket_spin_trylock
>    - __ticket_spin_trylock
>       - 60.67% _raw_spin_lock
>          - 99.60% ext4_es_lru_add
>             + 99.63% ext4_es_lookup_extent
>       - 39.15% do_raw_spin_lock
>          - _raw_spin_lock
>             + 95.38% ext4_es_lru_add
>               0.51% insert_inode_locked
>                  __ext4_new_inode
> -  16.20%  [kernel]  [k] native_read_tsc
>    - native_read_tsc
>       - 60.91% delay_tsc
>            __delay
>            do_raw_spin_lock
>          + _raw_spin_lock
>       - 39.09% __delay
>            do_raw_spin_lock
>          + _raw_spin_lock
>
> Ext4 unlink is serialised on orphan list processing:
>
> -  12.67%  [kernel]  [k] __mutex_unlock_slowpath
>    - __mutex_unlock_slowpath
>       - 99.95% mutex_unlock
>          + 54.37% ext4_orphan_del
>          + 43.26% ext4_orphan_add
> +   5.33%  [kernel]  [k] __mutex_lock_slowpath
>
>
> btrfs create has tree lock problems:
>
> -  21.68%  [kernel]  [k] __write_lock_failed
>    - __write_lock_failed
>       - 99.93% do_raw_write_lock
>          - _raw_write_lock
>             - 79.04% btrfs_try_tree_write_lock
>                - btrfs_search_slot
>                   - 97.48% btrfs_insert_empty_items
>                        99.82% btrfs_new_inode
>                   + 2.52% btrfs_lookup_inode
>             - 20.37% btrfs_tree_lock
>                - 99.38% btrfs_search_slot
>                     99.92% btrfs_insert_empty_items
>                  0.52% btrfs_lock_root_node
>                     btrfs_search_slot
>                     btrfs_insert_empty_items
> -  21.24%  [kernel]  [k] _raw_spin_unlock_irqrestore
>    - _raw_spin_unlock_irqrestore
>       - 61.22% prepare_to_wait
>          + 61.52% btrfs_tree_lock
>          + 32.31% btrfs_tree_read_lock
>            6.17% reserve_metadata_bytes
>               btrfs_block_rsv_add
>
> btrfs walk phase hammers the inode_hash_lock:
>
> -  18.45%  [kernel]  [k] __ticket_spin_trylock
>    - __ticket_spin_trylock
>       - 47.38% _raw_spin_lock
>          + 42.99% iget5_locked
>          + 15.17% __remove_inode_hash
>          + 13.77% btrfs_get_delayed_node
>          + 11.27% inode_tree_add
>          + 9.32% btrfs_destroy_inode
>          .....
>       - 46.77% do_raw_spin_lock
>          - _raw_spin_lock
>             + 30.51% iget5_locked
>             + 11.40% __remove_inode_hash
>             + 11.38% btrfs_get_delayed_node
>             + 9.45% inode_tree_add
>             + 7.28% btrfs_destroy_inode
>             .....
>
> I have a RCU inode hash lookup patch floating around somewhere if
> someone wants it...
>
> And, well, the less said about btrfs unlinks the better:
>
> +  37.14%  [kernel]  [k] _raw_spin_unlock_irqrestore
> +  33.18%  [kernel]  [k] __write_lock_failed
> +  17.96%  [kernel]  [k] __read_lock_failed
> +   1.35%  [kernel]  [k] _raw_spin_unlock_irq
> +   0.82%  [kernel]  [k] __do_softirq
> +   0.53%  [kernel]  [k] btrfs_tree_lock
> +   0.41%  [kernel]  [k] btrfs_tree_read_lock
> +   0.41%  [kernel]  [k] do_raw_read_lock
> +   0.39%  [kernel]  [k] do_raw_write_lock
> +   0.38%  [kernel]  [k] btrfs_clear_lock_blocking_rw
> +   0.37%  [kernel]  [k] free_extent_buffer
> +   0.36%  [kernel]  [k] btrfs_tree_read_unlock
> +   0.32%  [kernel]  [k] do_raw_write_unlock
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
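
For reference, the arithmetic behind the (*) footnote in the quoted table can be checked with a few lines of Python. This is only a sketch: it assumes the btrfs removal rate stays constant for the whole run, which the quoted mail does not claim, and the variable and function names are made up for illustration. The total inode count is not given in the quoted text, so the script only derives the count implied by the 12000 s estimate.

```python
# Figures copied from the quoted table and its (*) footnote.
FIRST_BATCH_INODES = 4.8e6        # inodes removed in the measured window
FIRST_BATCH_SECONDS = 18.5 * 60   # 18.5 minutes, in seconds
UNLINK_SECONDS = {"xfs": 295, "ext4": 2053, "btrfs": 12000}  # btrfs is the estimate


def removal_rate(inodes, seconds):
    """Average removal rate in inodes per second."""
    return inodes / seconds


def implied_inode_count(rate, total_seconds):
    """Inodes removed in total_seconds, ASSUMING the rate never changes."""
    return rate * total_seconds


def slowdown_vs(baseline, times):
    """Each filesystem's unlink time as a multiple of the baseline's."""
    return {fs: t / times[baseline] for fs, t in times.items()}


rate = removal_rate(FIRST_BATCH_INODES, FIRST_BATCH_SECONDS)
implied = implied_inode_count(rate, UNLINK_SECONDS["btrfs"])
ratios = slowdown_vs("xfs", UNLINK_SECONDS)

print(f"btrfs removal rate: {rate:.0f} inodes/s")
print(f"inode count implied by the 12000 s estimate: {implied / 1e6:.1f} million")
for fs, ratio in ratios.items():
    print(f"{fs:6s} unlink time relative to xfs: {ratio:.1f}x")
```

If the removal rate degrades as the run goes on, the real btrfs unlink time would be longer than the estimate, so the ratio against xfs printed here is best read as a lower bound.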