On Mon, Feb 03, 2025 at 12:49:42PM +0900, Sergey Senozhatsky wrote:
> On (25/02/01 17:21), Kairui Song wrote:
> > This seems will cause a huge regression of performance on multi core
> > systems, this is especially significant as the number of concurrent
> > tasks increases:
> >
> > Test build linux kernel using ZRAM as SWAP (1G memcg):
> >
> > Before:
> > + /usr/bin/time make -s -j48
> > 2495.77user 2604.77system 2:12.95elapsed 3836%CPU (0avgtext+0avgdata
> > 863304maxresident)k
> >
> > After:
> > + /usr/bin/time make -s -j48
> > 2403.60user 6676.09system 3:38.22elapsed 4160%CPU (0avgtext+0avgdata
> > 863276maxresident)k
>
> How many CPUs do you have? I assume, preemption gets into way which is
> sort of expected, to be honest... Using per-CPU compression streams
> disables preemption and uses CPU exclusively at a price of other tasks
> not being able to run. I do tend to think that I made a mistake by
> switching zram to per-CPU compression streams.

FWIW, I am not familiar at all with the zram code, but zswap uses per-CPU
acomp contexts with a mutex instead of a spinlock. So the task uses the
context of the CPU that it started on, but it can be preempted or migrated
and end up running on a different CPU. This means that contention is still
possible, but probably much lower than with a shared pool of contexts that
all CPUs compete on.

Again, this could be irrelevant as I am not very familiar with the zram
code, just thought this may be useful.