On (25/02/01 17:21), Kairui Song wrote:
> It seems this will cause a huge performance regression on multi-core
> systems, and it is especially significant as the number of concurrent
> tasks increases:
>
> Test: build the Linux kernel using ZRAM as SWAP (1G memcg):
>
> Before:
> + /usr/bin/time make -s -j48
> 2495.77user 2604.77system 2:12.95elapsed 3836%CPU (0avgtext+0avgdata
> 863304maxresident)k
>
> After:
> + /usr/bin/time make -s -j48
> 2403.60user 6676.09system 3:38.22elapsed 4160%CPU (0avgtext+0avgdata
> 863276maxresident)k

How many CPUs do you have?

I assume preemption gets in the way, which is sort of expected, to be
honest... Using per-CPU compression streams disables preemption and
uses the CPU exclusively, at the price of other tasks not being able
to run. I do tend to think that I made a mistake by switching zram to
per-CPU compression streams.

What preemption model do you use, and to what extent do you overload
your system? My tests don't show anything unusual (but I don't
overload the system):

CONFIG_PREEMPT

before
1371.96user 156.21system 1:30.91elapsed 1680%CPU (0avgtext+0avgdata 825636maxresident)k
32688inputs+1768416outputs (259major+51539861minor)pagefaults 0swaps

after
1372.05user 155.79system 1:30.82elapsed 1682%CPU (0avgtext+0avgdata 825684maxresident)k
32680inputs+1768416outputs (273major+51541815minor)pagefaults 0swaps

(I use zram as a block device with ext4 on it.)

> `perf lock contention -ab sleep 3` also indicates the big spin lock in
> zcomp_stream_get/put is having significant contention:

Hmm, it's just

	spin_lock()
	list first entry
	spin_unlock()

That shouldn't be "a big spin lock"; that's very odd. I'm not familiar
with perf lock contention, let me take a look.