[..]

> > If we really want to compare CONFIG_THP_SWAP on before and after, it
> > should be with SSD because that's a more conventional setup. In this
> > case the users that have CONFIG_THP_SWAP=y only experience the
> > benefits of zswap with this series. You mentioned experimenting with
> > usemem to keep the memory allocated longer so that you're able to have
> > a fair test with the small SSD swap setup. Did that work?
>
> Thanks, these are good points. I ran this experiment with mm-unstable
> 9-17-2024, commit 248ba8004e76eb335d7e6079724c3ee89a011389.
>
> Data is based on average of 3 runs of the vm-scalability "usemem" test.

Thanks for the results, this makes much more sense. I see you also ran
the tests with a larger swap size, which is good. In the next iteration,
I would honestly drop the results with --sleep 0 because it's not a fair
comparison imo.

I see that in most cases we are observing higher sys time with zswap,
and sometimes even higher elapsed time, which is concerning. If the sys
time is higher when comparing zswap to SSD, but the elapsed time is not,
this can be normal due to compression on the CPU vs. asynchronous disk
writes. However, if the sys time increases when comparing
CONFIG_THP_SWAP=n before this series to CONFIG_THP_SWAP=y with this
series (i.e. comparing zswap with 4K vs. zswap with mTHP), then that's
a problem. An increase in the total elapsed time is also a problem.

My main concern is that synchronous compression of an mTHP may be too
expensive an operation to do in one shot. I am wondering if we need to
implement asynchronous swapout for zswap, so that it behaves more like
swapping to disk from a reclaim perspective.

Anyway, there are too many test results now. For the next version, I
would suggest having only two test cases:

1. Comparing zswap 4K vs zswap mTHP. This would be done by comparing
   CONFIG_THP_SWAP=n to CONFIG_THP_SWAP=y, as you did before.
2. Comparing SSD swap mTHP vs zswap mTHP.
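For case 2, something along these lines is what I have in mind as a setup
sketch. The swapfile path and size are placeholders, and the knobs assume
the per-size mTHP sysfs interface and the zswap module parameter in recent
kernels; case 1 itself is a build-time CONFIG_THP_SWAP toggle, so it is
not shown here:

```shell
# Sketch only; paths, sizes, and mTHP size are placeholders.

# Large swapfile so allocations are not capped by swap space:
fallocate -l 64G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# Make anon mTHPs eligible (example: 64K):
echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled

# Toggle for case 2 on the same CONFIG_THP_SWAP=y kernel:
echo N > /sys/module/zswap/parameters/enabled   # SSD swap mTHP
echo Y > /sys/module/zswap/parameters/enabled   # zswap mTHP
```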
In both cases, I think we want to use a sufficiently large swapfile and
make the usemem processes sleep for a while to maintain the memory
allocations. Since we already confirmed the theory that the restricted
swapfile results were due to processes exiting immediately, I don't see
value in running further tests with a restricted swapfile or without
sleeping.

Thanks!