Hello,

this is my version of the patches to improve orphan list scaling by
reducing the amount of work done under the global s_orphan_mutex.
Thavatchai and I disagree about whose patches are better (see the thread at
http://www.spinics.net/lists/linux-ext4/msg43220.html), so I guess it's up
to Ted or other people on this list to decide.

When running code that stresses orphan list operations [1] with these
patches, I see s_orphan_lock move from number 1 in the lock_stat report to
unmeasurable. So with the patches applied, other locks are much more
problematic (the superblock buffer lock and bh_state lock, j_list_lock,
buffer locks for inode buffers when several inodes share a block...).

The average times over 10 runs of the test program on my 48-way box with
ext4 on a ramdisk are:

              Vanilla                    Patched
Procs     Avg         Stddev         Avg         Stddev
    1     2.769200    0.056194       2.890700    0.061727
    2     5.756500    0.313268       4.383500    0.161629
    4    11.852500    0.130221       6.542900    0.160039
   10    33.590900    0.394888      27.749500    0.615517
   20    71.035400    0.320914      76.368700    3.734557
   40   236.671100    2.856885     228.236800    2.361391

So we can see that the biggest speedup was for 2, 4, and 10 processes. For
higher process counts, contention and cache bouncing prevented any
significant speedup (we can even see a barely-out-of-noise performance drop
for 20 processes).

Changes since v2:
* Fixed various bugs in error handling pointed out by Thavatchai, and some
  others as well
* Somewhat reduced the critical sections under s_orphan_lock

[1] The test program runs a given number of processes. Each process
truncates a 4k file by 1 byte at a time until it reaches a size of 1 byte,
and then extends the file to 4k again.

								Honza
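
For readers who want to try a similar workload, the test described in [1] can be sketched roughly as follows. This is not the original test program (which is not included in this mail), only a minimal Python approximation. The names truncate_cycle, worker and run_stress, the "stress-N" file names, the cycle count, and the choice of one file per process are all assumptions made for illustration. To exercise the ext4 orphan list at all, `directory` has to point at an ext4 mount, and because of interpreter overhead the timings will not be comparable to the numbers in the table.

```python
import os
import tempfile

FILE_SIZE = 4096  # 4k file, as in the description of the test program


def truncate_cycle(fd, size=FILE_SIZE):
    """Shrink the open file one byte at a time to 1 byte, then extend it back.

    Returns the number of ftruncate() calls issued.
    """
    calls = 0
    for new_size in range(size - 1, 0, -1):  # size-1, size-2, ..., 1
        os.ftruncate(fd, new_size)
        calls += 1
    os.ftruncate(fd, size)  # grow back to the full size
    return calls + 1


def worker(path, cycles, size=FILE_SIZE):
    """Create a file of `size` bytes and run `cycles` shrink/extend cycles on it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, size)
        for _ in range(cycles):
            truncate_cycle(fd, size)
    finally:
        os.close(fd)


def run_stress(nprocs, cycles, directory, size=FILE_SIZE):
    """Fork `nprocs` children, each on its own file (an assumption).

    Returns True if every child exited with status 0.
    """
    pids = []
    for i in range(nprocs):
        pid = os.fork()
        if pid == 0:  # child process
            status = 1
            try:
                worker(os.path.join(directory, f"stress-{i}"), cycles, size)
                status = 0
            finally:
                os._exit(status)  # never return into the parent's code path
        pids.append(pid)
    ok = True
    for pid in pids:
        _, wstatus = os.waitpid(pid, 0)
        ok = ok and os.WIFEXITED(wstatus) and os.WEXITSTATUS(wstatus) == 0
    return ok


if __name__ == "__main__":
    # A temporary directory is used only so the sketch runs anywhere;
    # replace it with a directory on the ext4 filesystem under test.
    with tempfile.TemporaryDirectory() as d:
        print(run_stress(nprocs=4, cycles=10, directory=d))
```

Timing run_stress for each process count (1, 2, 4, 10, 20, 40) and repeating the measurement several times would give a table of the same shape as the one above.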