On Sat, Feb 12, 2022 at 05:12:19AM +0900, Alexey Avramov wrote:
> Aggressive swapping even with vm.swappiness=1 with MGLRU
> ========================================================
>
> Reading a large mmapped file leads to a super aggressive swapping.
> Reducing vm.swappiness even to 1 does not have effect.

Mind explaining why you think it's "super aggressive"? I assume you
expected a different behavior that would perform better. If so,
please spell it out.

> Demo: https://www.youtube.com/watch?v=J81kwJeuW58
>
> Linux 5.17-rc3, Multigenerational LRU v7,
> vm.swappiness=1, MemTotal: 11.5 GiB.
>
> $ cache-bench -r 35000 -m1 -b1 -p1 -f test20000
> Reading mmapped file (file size: 20000 MiB)
> cache-bench v0.2.0: https://github.com/hakavlad/cache-bench

Writing your own benchmark is a good exercise but fio is the standard
benchmark in this case. Please use it with --ioengine=mmap.

> Swapping started with MemAvailable=71%.
> At the end 33 GiB was swapped out when MemAvailable=60%.
>
> Is it OK?

MemAvailable is an estimate (free + page cache), and it doesn't imply
any reclaim preferences. In the worst case scenario, e.g., out of swap
space, MemAvailable *may* be reclaimed.

Here is my benchmark result with file mmap + *high* swap usage. Ram
disk was used to reduce the variance in the result (and SSD wear, if
you care).
More details on additional configurations here:
https://lore.kernel.org/linux-mm/20220208081902.3550911-6-yuzhao@xxxxxxxxxx/

  Mixed workloads:
    fio (buffered I/O): +13%
                  IOPS         BW
      5.17-rc3:   275k         1075MiB/s
      v7:         313k         1222MiB/s

    memcached (anon): +12%
                  Ops/sec      KB/sec
      5.17-rc3:   511282.72    19861.04
      v7:         572408.80    22235.49

  cat mmap.sh
  systemctl restart memcached
  swapoff -a
  umount /mnt
  rmmod brd

  modprobe brd rd_nr=2 rd_size=56623104

  mkswap /dev/ram0
  swapon /dev/ram0

  mkfs.ext4 /dev/ram1
  mount -t ext4 /dev/ram1 /mnt

  memtier_benchmark -S /var/run/memcached/memcached.sock \
    -P memcache_binary -n allkeys --key-minimum=1 \
    --key-maximum=50000000 --key-pattern=P:P -c 1 \
    -t 36 --ratio 1:0 --pipeline 8 -d 2000

  sysctl vm.overcommit_memory=1

  fio -name=mglru --numjobs=36 --directory=/mnt --size=1408m \
    --buffered=1 --ioengine=mmap --iodepth=128 --iodepth_batch_submit=32 \
    --iodepth_batch_complete=32 --rw=randread --random_distribution=random \
    --norandommap --time_based --ramp_time=10m --runtime=990m \
    --group_reporting &
  pid=$!

  sleep 200

  memtier_benchmark -S /var/run/memcached/memcached.sock \
    -P memcache_binary -n allkeys --key-minimum=1 \
    --key-maximum=50000000 --key-pattern=R:R -c 1 -t 36 --ratio 0:1 \
    --pipeline 8 --randomize --distinct-client-seed

  kill -INT $pid
  wait
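
On the MemAvailable point above: a rough way to see what the estimate is
made of on a running system is to print it next to MemFree and the file
LRU sizes from /proc/meminfo. This is only a sketch. meminfo_summary is
a made-up helper name, the optional file argument exists only to make it
easy to try on a saved copy, and the kernel's estimate also accounts for
watermarks and reclaimable slab, so the numbers will not add up exactly.

```shell
#!/bin/sh
# meminfo_summary [FILE]: print MemAvailable next to its main components.
# FILE defaults to /proc/meminfo (hypothetical helper, not part of any tool).
meminfo_summary() {
    awk '
        $1 == "MemTotal:"       { total = $2 }
        $1 == "MemFree:"        { free  = $2 }
        $1 == "MemAvailable:"   { avail = $2 }
        $1 == "Active(file):"   { afile = $2 }
        $1 == "Inactive(file):" { ifile = $2 }
        END {
            # Without MemTotal the percentage is meaningless, so bail out.
            if (total == 0) { print "MemTotal missing"; exit 1 }
            printf "MemAvailable: %d kB (%.0f%% of MemTotal)\n", avail, 100 * avail / total
            printf "MemFree: %d kB\n", free
            printf "File LRU (active+inactive): %d kB\n", afile + ifile
        }
    ' "${1:-/proc/meminfo}"
}
```

Running meminfo_summary before and during a benchmark shows how much of
MemAvailable is page cache rather than free memory, which is the part
that says nothing about whether reclaim will pick file or anon pages.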