On 2024/1/19 18:26, Chris Li wrote:
> On Thu, Jan 18, 2024 at 10:19 PM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>
>> On 2024/1/19 12:59, Chris Li wrote:
>>> On Wed, Jan 17, 2024 at 11:35 PM Chengming Zhou
>>> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>>
>>>>>>>         mm-stable     zswap-split-tree   zswap-xarray
>>>>>>> real    1m10.442s     1m4.157s           1m9.962s
>>>>>>> user    17m48.232s    17m41.477s         17m45.887s
>>>>>>> sys     8m13.517s     5m2.226s           7m59.305s
>>>>>>>
>>>>>>> Looks like the concurrency contention is still there. I haven't
>>>>>>> looked into the code yet, will review it later.
>>>>>
>>>>> Thanks for the quick test. Interesting to see the sys usage drop for
>>>>> the xarray case even with the spin lock.
>>>>> Not sure if the 13-second saving is statistically significant or not.
>>>>>
>>>>> We might need to have both xarray and split trees for zswap. It is
>>>>> likely that removing the spin lock wouldn't make up the 35%
>>>>> difference. That is just my guess. There is only one way to find out.
>>>>
>>>> Yes, I totally agree with this! IMHO, concurrent zswap_store paths still
>>>> have to contend for the xarray spinlock even after we convert the
>>>> rb-tree to the xarray structure. So I think we should have both.
>>>>
>>>>>
>>>>> BTW, do you have a script I can run to replicate your results?
>>>
>>> Hi Chengming,
>>>
>>> Thanks for your script.
>>>
>>>>
>>>> ```
>>>> #!/bin/bash
>>>>
>>>> testname="build-kernel-tmpfs"
>>>> cgroup="/sys/fs/cgroup/$testname"
>>>>
>>>> tmpdir="/tmp/vm-scalability-tmp"
>>>> workdir="$tmpdir/$testname"
>>>>
>>>> memory_max="$((2 * 1024 * 1024 * 1024))"
>>>>
>>>> linux_src="/root/zcm/linux-6.6.tar.xz"
>>>> NR_TASK=32
>>>>
>>>> swapon ~/zcm/swapfile
>>>
>>> How big is your swapfile here?
>>
>> The swapfile is big enough here, I use a 50GB swapfile.
>
> Thanks,
>
>>
>>>
>>> It seems you have only one swapfile there. That can explain the contention.
>>> Have you tried multiple swapfiles for the same test?
>>> That should reduce the contention without using your patch.
>>
>> Do you mean to have many 64MB swapfiles swapped on at the same time?
>
> 64MB is too small. There is a limit, MAX_SWAPFILES, which is less than
> 32 (32 minus a few reserved slots).
> If you want to use 50GB of swap space, you can have MAX_SWAPFILES
> swapfiles, each of size 50GB / MAX_SWAPFILES.

Right.

>
>> Maybe it's feasible to test,
>
> Of course it is testable, I am curious to see the test results.
>
>> I'm not sure how swapout will choose.
>
> It will rotate through the swap files of the same priority first. See
> swapfile.c: get_swap_pages().
>
>> But in our use case, we normally have only one swapfile.
>
> Is there a good reason why you can't use more than one swapfile?

I don't think so, but it seems like an unneeded change/burden for our admins,
so I just tested and optimized for the normal case.

> One swapfile will not take full advantage of the existing code, even
> if you split the zswap trees within a swapfile. With only one
> swapfile, you will still have lock contention on "(struct
> swap_info_struct).lock".
> It is one lock per swapfile.
> Using more than one swapfile should get you better results.

IIUC, we already have the per-cpu swap entry cache to avoid contending on
this lock? And I don't see this lock being very hot in the testing.

Thanks.
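
[Editor's note: for anyone who wants to try the multiple-swapfile setup
discussed above, here is a minimal sketch, assuming ext4 and a 50GB total
budget split eight ways. The paths, the count of 8, and the sizes are all
illustrative, not from the original test.]

```
#!/bin/bash
# Sketch only: split a 50GB swap budget across several swapfiles at the
# same priority, so get_swap_pages() rotates across them instead of
# funneling every allocation through one swap_info_struct lock.
set -e

total_gb=50        # total swap budget (matches the 50GB single file above)
nr=8               # illustrative; must stay under MAX_SWAPFILES (< 32)
size_gb=$((total_gb / nr))

for i in $(seq 1 "$nr"); do
    f="/root/zcm/swapfile$i"            # hypothetical path
    fallocate -l "${size_gb}G" "$f"     # on filesystems where fallocate'd
                                        # swapfiles are problematic, use dd
    chmod 600 "$f"
    mkswap "$f"
    swapon -p 0 "$f"                    # equal priority => round-robin
done

swapon --show
```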
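
[Editor's note: and a rough way to sanity-check whether the per-swapfile
lock is actually hot during the run; again a sketch, since perf availability
and the exact slowpath symbol vary by kernel and architecture.]

```
# While the build test is running:
cat /proc/swaps     # "Used" should grow roughly evenly across the files
perf top -g         # spinlock contention shows up as time spent in
                    # native_queued_spin_lock_slowpath (on x86)
```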