Hello,

this is the fourth iteration of my series improving the handling of the sync syscall. Since the previous submission I have slightly cleaned up the iteration loops so that we don't have to pass void * around.

Christoph also asked why we do the non-blocking ->sync_fs() pass. My answer to that was: I also did measurements with the non-blocking ->sync_fs removed and I didn't see any regression with ext3, ext4, xfs, or btrfs. OTOH I can imagine *some* filesystem can do an equivalent of filemap_fdatawrite() on some metadata for the non-blocking ->sync_fs and filemap_fdatawrite_and_wait() for the blocking one, and if there are more such filesystems on different backing storage the performance difference can be noticeable (actually, checking the filesystems, JFS and Ceph seem to be doing something like this). So that's why I didn't include the change in the end... So Christoph, if you think we should get rid of the non-blocking ->sync_fs, I can include the patch, but personally I think it has some use. Arguably a cleaner interface for the users would be something like two methods, ->sync_fs_begin and ->sync_fs_end; filesystems that don't have much to optimize in ->sync_fs() would just use one of these functions.
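For illustration, here is a minimal sketch of the non-blocking vs. blocking ->sync_fs() pattern described above. It is not from the series; the filesystem name, its sb_info structure, and its meta_inode field are made-up names for a filesystem that keeps its metadata in a private inode's page cache:

static int examplefs_sync_fs(struct super_block *sb, int wait)
{
	struct examplefs_sb_info *sbi = sb->s_fs_info;
	struct address_space *meta = sbi->meta_inode->i_mapping;

	if (!wait)
		/* Non-blocking pass: just kick off metadata writeback. */
		return filemap_fdatawrite(meta);

	/* Blocking pass: start writeback and wait for it to finish. */
	return filemap_fdatawrite_and_wait(meta);
}

With the ->sync_fs_begin / ->sync_fs_end split suggested above (again, hypothetical names), such a filesystem would do the filemap_fdatawrite() in the begin method and the waiting in the end method, while filesystems with nothing to overlap would implement only one of the two.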
I have run three tests below to verify the performance impact of the patch series. Each test was run with 1, 2, and 4 filesystems mounted; the test with 2 filesystems had each filesystem on a different disk, and the test with 4 filesystems had 2 filesystems on the first disk and 2 filesystems on the second disk.

Test 1: Run sync 200 times with the filesystems mounted to verify the overhead of sync when there is no data to write.

Test 2: For each filesystem run a process creating 40 KB files, sleep for 3 seconds, run sync.

Test 3: For each filesystem run a process creating a 20 GB file, sleep for 5 seconds, run sync.

I have performed 10 runs of each test for the xfs, ext3, ext4, and btrfs filesystems.

Results of test 1
-----------------
Numbers are the time it took 200 syncs to complete. The character in parentheses is + if the time increased with 2*STDDEV reliability, - if it decreased with 2*STDDEV reliability, and 0 otherwise.

                BASE                    PATCHED
FS              AVG         STDDEV      AVG         STDDEV
xfs, 1 disks    4.189300    0.051525    2.141300    0.063389    (-)
xfs, 2 disks    4.820600    0.019096    4.611400    0.066322    (-)
xfs, 4 disks    6.518300    1.440362    6.435700    0.510641    (0)
ext4, 1 disks   4.085000    0.011375    1.689500    0.001360    (-)
ext4, 2 disks   4.088100    0.006488    1.705000    0.026359    (-)
ext4, 4 disks   4.107300    0.011934    1.702900    0.001814    (-)
ext3, 1 disks   4.080200    0.009527    1.703400    0.030559    (-)
ext3, 2 disks   4.138300    0.143909    1.694000    0.001414    (-)
ext3, 4 disks   4.107200    0.002482    1.702900    0.007778    (-)
btrfs, 1 disks  11.214600   0.086619    8.737200    0.081076    (-)
btrfs, 2 disks  32.910000   0.162089    30.673400   0.538820    (-)
btrfs, 4 disks  67.987700   1.655654    67.247100   1.971887    (0)

So we see nice improvements almost across the board.

Results of test 2
-----------------
Numbers are the time it took sync to complete.

                BASE                    PATCHED
FS              AVG         STDDEV      AVG         STDDEV
xfs, 1 disks    0.436000    0.012000    0.506000    0.014283    (+)
xfs, 2 disks    1.105000    0.055543    1.274000    0.244426    (0)
xfs, 4 disks    5.880000    2.997135    4.837000    3.875448    (0)
ext4, 1 disks   0.791000    0.055579    0.853000    0.042438    (0)
ext4, 2 disks   18.232000   13.505638   17.254000   2.000506    (0)
ext4, 4 disks   491.790000  218.565229  696.783000  234.933562  (0)
ext3, 1 disks   15.315000   2.065465    1.900000    0.184662    (-)
ext3, 2 disks   128.524000  18.090519   55.278000   1.530554    (-)
ext3, 4 disks   221.202000  30.090432   232.849000  68.745423   (0)
btrfs, 1 disks  0.452000    0.026000    0.494000    0.023749    (0)
btrfs, 2 disks  5.156000    4.530852    4.083000    1.560519    (0)
btrfs, 4 disks  31.154000   11.220828   36.987000   17.334126   (0)

Except for ext3, which got a nice boost here, and XFS, which seems to be a bit slower, there are no changes that stand out of the noise.

Results of test 3
-----------------
Numbers are the time it took sync to complete.

                BASE                    PATCHED
FS              AVG         STDDEV      AVG         STDDEV
xfs, 1 disks    12.083000   0.058660    10.898000   0.285475    (-)
xfs, 2 disks    20.182000   0.549614    14.977000   0.351114    (-)
xfs, 4 disks    35.814000   5.318310    28.452000   3.332281    (0)
ext4, 1 disks   32.956000   5.753789    20.865000   3.892098    (0)
ext4, 2 disks   34.922000   3.051966    27.411000   2.752978    (0)
ext4, 4 disks   44.508000   6.829004    28.360000   2.561437    (0)
ext3, 1 disks   23.475000   1.288885    17.116000   0.319631    (-)
ext3, 2 disks   43.508000   4.998647    41.547000   2.597976    (0)
ext3, 4 disks   92.130000   11.344117   79.362000   9.891208    (0)
btrfs, 1 disks  12.478000   0.394304    12.847000   0.171117    (0)
btrfs, 2 disks  15.030000   0.777817    18.014000   2.011418    (0)
btrfs, 4 disks  32.395000   4.248859    38.411000   3.179939    (0)

Here we see XFS and ext3 had some improvements, and ext4 likely did as well, although the results are relatively noisy.

								Honza