On Mon, Nov 07, 2022 at 01:31:14PM -0800, Nhat Pham wrote:
> We have benchmarked the lock consolidation to see the performance effect of
> this change on zram. First, we ran a synthetic FS workload on a server machine
> with 36 cores (same machine for all runs), using this benchmark script:
>
> https://github.com/josefbacik/fs_mark
>
> using 32 threads, and cranking the pressure up to > 80% FS usage.
>
> Here is the result (unit is file/second):
>
> With lock consolidation (btrfs):
> Average: 13520.2, Median: 13531.0, Stddev: 137.5961482019028
>
> Without lock consolidation (btrfs):
> Average: 13487.2, Median: 13575.0, Stddev: 309.08283679298665
>
> With lock consolidation (ext4):
> Average: 16824.4, Median: 16839.0, Stddev: 89.97388510006668
>
> Without lock consolidation (ext4)
> Average: 16958.0, Median: 16986.0, Stddev: 194.7370021336469
>
> As you can see, we observe a 0.3% regression for btrfs, and a 0.9% regression
> for ext4. This is a small, barely measurable difference in my opinion.
>
> For a more realistic scenario, we also tries building the kernel on zram.
> Here is the time it takes (in seconds):
>
> With lock consolidation (btrfs):
> real
> Average: 319.6, Median: 320.0, Stddev: 0.8944271909999159
> user
> Average: 6894.2, Median: 6895.0, Stddev: 25.528415540334656
> sys
> Average: 521.4, Median: 522.0, Stddev: 1.51657508881031
>
> Without lock consolidation (btrfs):
> real
> Average: 319.8, Median: 320.0, Stddev: 0.8366600265340756
> user
> Average: 6896.6, Median: 6899.0, Stddev: 16.04057355583023
> sys
> Average: 520.6, Median: 521.0, Stddev: 1.140175425099138
>
> With lock consolidation (ext4):
> real
> Average: 320.0, Median: 319.0, Stddev: 1.4142135623730951
> user
> Average: 6896.8, Median: 6878.0, Stddev: 28.621670111997307
> sys
> Average: 521.2, Median: 521.0, Stddev: 1.7888543819998317
>
> Without lock consolidation (ext4)
> real
> Average: 319.6, Median: 319.0, Stddev: 0.8944271909999159
> user
> Average: 6886.2, Median: 6887.0, Stddev: 16.93221781102523
> sys
> Average: 520.4, Median: 520.0, Stddev: 1.140175425099138
>
> The difference is entirely within the noise of a typical run on zram. This
> hardly justifies the complexity of maintaining both the pool lock and the class
> lock. In fact, for writeback, we would need to introduce yet another lock to

I am glad to make the zsmalloc lock scheme simpler without a meaningful
regression, since it introduced a lot of mess. Please include the test results
in the description.

Thanks for the testing, Nhat.
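
For anyone re-checking the numbers: the 0.3% and 0.9% fs_mark regressions quoted above appear to match the medians rather than the averages (by average, btrfs is actually slightly faster with the change). A minimal sketch of that arithmetic, assuming the percentages were taken from the medians; the helper names `percent_change` and `median_changes` are made up for illustration and are not part of the patch or of fs_mark:

```python
def percent_change(with_change: float, without_change: float) -> float:
    """Relative change versus the 'without' baseline, in percent.

    Negative means lower throughput with the change applied.
    """
    return (with_change - without_change) / without_change * 100.0


# Median fs_mark throughput (files/second), copied from the quoted results.
MEDIANS = {
    "btrfs": {"with": 13531.0, "without": 13575.0},
    "ext4": {"with": 16839.0, "without": 16986.0},
}


def median_changes(medians=MEDIANS):
    """Percent change per filesystem, rounded to one decimal place."""
    return {
        fs: round(percent_change(v["with"], v["without"]), 1)
        for fs, v in medians.items()
    }


if __name__ == "__main__":
    for fs, delta in median_changes().items():
        print(f"{fs}: {delta:+.1f}%")
```

Both deltas are well inside the run-to-run standard deviations reported for the runs without lock consolidation, which is consistent with reading them as noise.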