On Mon, Mar 10, 2025 at 6:22 AM Qun-wei Lin (林群崴) <Qun-wei.Lin@xxxxxxxxxxxx> wrote:
>
>
> Thank you for your explanation. Compared to the original single kswapd,
> we expect t1 to have a slight increase in re-scan time. However, since
> our kcompressd can focus on compression tasks and we can have multiple
> kcompressd instances (kcompressd0, kcompressd1, ...) running in
> parallel, we anticipate that the number of times a folio needs to be
> re-scanned will not be too high.
>
> In our experiments, we fixed the CPU and DRAM at a certain frequency.
> We created a high-memory-pressure environment using a memory eater and
> recorded the increase in pgsteal_anon per second, which was around
> 300,000. Then we applied our patch and measured again; pgsteal_anon/s
> increased to over 800,000.
>
> > > > Problem:
> > > > In the current system, the kswapd thread is responsible for both
> > > > scanning the LRU pages and compressing pages into the ZRAM. This
> > > > combined responsibility can lead to significant performance
> > > > bottlenecks,
> > >
> > > What bottleneck are we talking about? Is one stage slower than the
> > > other?
> > >
> > > > especially under high memory pressure. The kswapd thread becomes a
> > > > single point of contention, causing delays in memory reclaiming
> > > > and overall system performance degradation.
> > > >
> > > > Target:
> > > > The target of this invention is to improve the efficiency of
> > > > memory reclaiming. By separating the tasks of page scanning and
> > > > page compression into distinct processes or threads, the system
> > > > can handle memory pressure more effectively.
> > >
> > > I'm not a zram maintainer, so I'm definitely not trying to stop this
> > > patch. But whatever problem zram is facing will likely occur with
> > > zswap too, so I'd like to learn more :)
> >
> > Right, this is likely something that could be addressed more
> > generally for zswap and zram.
> >
>
> Yes, we also hope to extend this to other swap devices, but currently
> we have only modified zram. We are not very familiar with zswap and
> would like to ask if anyone has any suggestions for modifications.
>

My understanding is that right now schedule_bio_write() is the work
submission API, right? We can make it generic by having it accept a
callback and a generic untyped pointer which can be cast to a
backend-specific context struct. For zram, that context would contain
the struct zram and the bio. For zswap, it depends on the point at
which you want to begin offloading the work - it could simply be the
folio itself if we offload early, or a more complicated scheme. A
rough sketch of what I mean is at the end of this mail.

> > Thanks
> > Barry
>
> Best Regards,
> Qun-wei
>
>
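Something along these lines (a rough sketch only: kcompressd_work,
kcompressd_submit() and the two callbacks are hypothetical names I made
up for illustration, and zram_bio_write()/zswap_store() stand in for
whatever write path each backend already has):

#include <linux/bio.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/slab.h>

/* Generic queue entry: a callback plus an untyped backend context. */
struct kcompressd_work {
	struct list_head entry;
	void (*fn)(void *ctx);		/* backend-specific handler */
	void *ctx;			/* cast by the backend's fn */
};

/* Shared submission API: queue the work and wake a kcompressd. */
int kcompressd_submit(void (*fn)(void *ctx), void *ctx);

/* zram backend: the untyped pointer carries the device and the bio. */
struct zram_work_ctx {
	struct zram *zram;
	struct bio *bio;
};

static void zram_work_fn(void *ctx)
{
	struct zram_work_ctx *zw = ctx;

	/* compress + store, then end the bio (existing zram path) */
	zram_bio_write(zw->zram, zw->bio);
	kfree(zw);
}

/*
 * zswap backend: if we offload early, the context can simply be the
 * folio itself; error and completion handling are omitted here.
 */
static void zswap_work_fn(void *ctx)
{
	struct folio *folio = ctx;

	zswap_store(folio);
}

A backend would then allocate and fill its context and call, e.g.,
kcompressd_submit(zram_work_fn, zw) instead of doing the write
synchronously, so kswapd never blocks on compression and the
kcompressd threads stay backend-agnostic.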