On Mon, Mar 10, 2025 at 6:22 AM Qun-wei Lin (林群崴) <Qun-wei.Lin@xxxxxxxxxxxx> wrote:
>
>
> Thank you for your explanation. Compared to the original single kswapd,
> we expect t1 to have a slight increase in re-scan time. However, since
> our kcompressd can focus on compression tasks and we can have multiple
> kcompressd instances (kcompressd0, kcompressd1, ...) running in
> parallel, we anticipate that the number of times a folio needs to be
> re-scanned will not be too high.
>
> In our experiments, we fixed the CPU and DRAM at a certain frequency.
> We created a high-memory-pressure environment using a memory eater and
> recorded the increase in pgsteal_anon per second, which was around
> 300,000. Then we applied our patch and measured again; pgsteal_anon/s
> increased to over 800,000.
>
> > > > Problem:
> > > > In the current system, the kswapd thread is responsible for both
> > > > scanning the LRU pages and compressing pages into the ZRAM. This
> > > > combined responsibility can lead to significant performance
> > > > bottlenecks,
> > >
> > > What bottleneck are we talking about? Is one stage slower than the
> > > other?
> > >
> > > > especially under high memory pressure. The kswapd thread becomes a
> > > > single point of contention, causing delays in memory reclaiming
> > > > and overall system performance degradation.
> > > >
> > > > Target:
> > > > The target of this invention is to improve the efficiency of
> > > > memory reclaiming. By separating the tasks of page scanning and
> > > > page compression into distinct processes or threads, the system
> > > > can handle memory pressure more effectively.
> > >
> > > I'm not a zram maintainer, so I'm definitely not trying to stop this
> > > patch. But whatever problem zram is facing will likely occur with
> > > zswap too, so I'd like to learn more :)
> >
> > Right, this is likely something that could be addressed more
> > generally for zswap and zram.
> >
>
> Yes, we also hope to extend this to other swap devices, but currently
> we have only modified zram. We are not very familiar with zswap and
> would like to ask if anyone has any suggestions for modifications.
>

My understanding is that right now schedule_bio_write() is the work
submission API, right? We can make it generic by having it accept a
callback and a generic untyped pointer which can be cast to a
backend-specific context struct. For zram, that context would contain
the struct zram and the bio. For zswap, it depends on the point at
which you want to begin offloading the work - it could simply be the
folio itself if we offload early, or a more complicated scheme. A
rough sketch of what I mean is at the end of this mail.

> > Thanks
> > Barry
>
> Best Regards,
> Qun-wei
>
>
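Something along these lines (a rough sketch only: kcompressd_work,
kcompressd_submit() and the two callbacks are hypothetical names I made
up for illustration, and zram_bio_write()/zswap_store() stand in for
whatever write path each backend already has):

#include <linux/bio.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/slab.h>

/* Generic queue entry: a callback plus an untyped backend context. */
struct kcompressd_work {
	struct list_head entry;
	void (*fn)(void *ctx);		/* backend-specific handler */
	void *ctx;			/* cast by the backend's fn */
};

/* Shared submission API: queue the work and wake a kcompressd. */
int kcompressd_submit(void (*fn)(void *ctx), void *ctx);

/* zram backend: the untyped pointer carries the device and the bio. */
struct zram_work_ctx {
	struct zram *zram;
	struct bio *bio;
};

static void zram_work_fn(void *ctx)
{
	struct zram_work_ctx *zw = ctx;

	/* compress + store, then end the bio (existing zram path) */
	zram_bio_write(zw->zram, zw->bio);
	kfree(zw);
}

/*
 * zswap backend: if we offload early, the context can simply be the
 * folio itself; error and completion handling are omitted here.
 */
static void zswap_work_fn(void *ctx)
{
	struct folio *folio = ctx;

	zswap_store(folio);
}

A backend would then allocate and fill its context and call, e.g.,
kcompressd_submit(zram_work_fn, zw) instead of doing the write
synchronously, so kswapd never blocks on compression and the
kcompressd threads stay backend-agnostic.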