Re: + zram-do-not-use-per-cpu-compression-streams.patch added to mm-unstable branch

Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx> · Thu, 30 Jan 2025 19:00:17 +0900

On (25/01/27 12:58), Andrew Morton wrote:
> The patch titled
>      Subject: zram: do not use per-CPU compression streams
> has been added to the -mm mm-unstable branch.  Its filename is
>      zram-do-not-use-per-cpu-compression-streams.patch
> 
> This patch will shortly appear at
>      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/zram-do-not-use-per-cpu-compression-streams.patch
> 
> This patch will later appear in the mm-unstable branch at
>     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
> 
> ------------------------------------------------------
> From: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> Subject: zram: do not use per-CPU compression streams
> Date: Mon, 27 Jan 2025 16:29:13 +0900
> 
> Similarly to the per-entry spin-lock, per-CPU compression streams also
> have a number of shortcomings.
> 
> First, per-CPU stream access has to be done from a non-preemptible
> (atomic) section, which imposes the same atomicity requirements on
> compression backends as the entry spin-lock does and makes it impossible
> to use algorithms that can schedule/wait/sleep during compression and
> decompression.
> 
> Second, per-CPU streams noticeably increase memory usage (actually more
> like wastage) of secondary compression streams.  The problem is that
> secondary compression streams are allocated per-CPU, just like the primary
> streams are.  Yet we never use more than one secondary stream at a time,
> because recompression is a single-threaded action.  This means that the
> remaining num_online_cpus() - 1 streams are allocated for nothing, and
> this is per priority list (we can have several secondary compression
> algorithms).  Depending on the algorithm, this may lead to significant
> memory wastage.  In addition, each stream also carries a workmem buffer
> (2 physical pages).
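
To put a rough number on the wastage described above, here is a small
user-space sketch.  The function name and its parameters are made up for
illustration and do not come from the patch.  It counts only the 2-page
workmem buffer per stream; algorithm-specific state comes on top of that
and is not modelled.

```c
#include <stddef.h>

/*
 * Hypothetical helper: bytes of workmem held by secondary streams that
 * can never be in use at the same time, given that recompression only
 * ever uses one stream per secondary algorithm.
 */
size_t idle_secondary_workmem(size_t online_cpus, size_t secondary_algos,
			      size_t page_size)
{
	const size_t workmem_pages = 2;	/* "2 physical pages" per stream */

	if (online_cpus == 0)
		return 0;

	/* One stream per algorithm is useful, the rest sit idle. */
	return (online_cpus - 1) * secondary_algos * workmem_pages * page_size;
}
```

Assuming, for example, 8 online CPUs, 3 secondary algorithms and 4 KiB
pages, this gives 7 * 3 * 2 * 4096 = 172032 bytes (168 KiB) of workmem
alone.
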
> 
> Instead of per-CPU streams, maintain a list of idle compression streams
> and allocate new streams on-demand (something that we used to do many
> years ago).  This makes zram read() and write() non-atomic and eases the
> requirements on the compression algorithm implementation.  This also
> means that we now should have only one secondary stream per priority
> list.
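
The idle-list idea can be modelled in user space roughly as follows.
This is only a sketch under assumptions, not the code from the patch:
the struct and function names are hypothetical, a pthread mutex stands
in for whatever locking the kernel code uses, and there is no cap on the
number of streams.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space model; names do not come from the patch. */
struct stream {
	struct stream *next;
	void *workmem;
};

struct stream_pool {
	pthread_mutex_t lock;
	struct stream *idle;	/* singly linked list of idle streams */
	size_t workmem_size;
};

void pool_init(struct stream_pool *p, size_t workmem_size)
{
	pthread_mutex_init(&p->lock, NULL);
	p->idle = NULL;
	p->workmem_size = workmem_size;
}

/* Take an idle stream, or allocate a new one if the list is empty. */
struct stream *stream_get(struct stream_pool *p)
{
	struct stream *s;

	pthread_mutex_lock(&p->lock);
	s = p->idle;
	if (s)
		p->idle = s->next;
	pthread_mutex_unlock(&p->lock);
	if (s)
		return s;

	/* Allocation happens outside the lock, so it is allowed to block. */
	s = malloc(sizeof(*s));
	if (!s)
		return NULL;
	s->workmem = malloc(p->workmem_size);
	if (!s->workmem) {
		free(s);
		return NULL;
	}
	s->next = NULL;
	return s;
}

/* Return a stream to the idle list so that the next caller can reuse it. */
void stream_put(struct stream_pool *p, struct stream *s)
{
	pthread_mutex_lock(&p->lock);
	s->next = p->idle;
	p->idle = s;
	pthread_mutex_unlock(&p->lock);
}
```

In this model the caller holds no lock while it uses a stream, so the
compression step itself may sleep.  A single-threaded user, such as
recompression, only ever causes one stream to be allocated, because each
stream_put() makes the same stream available to the next stream_get().
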
> 
> Link: https://lkml.kernel.org/r/20250127072932.1289973-3-senozhatsky@xxxxxxxxxxxx

Andrew, I will send an updated version of this entire series.