Hi Yosry,

On Thu, Dec 14, 2023 at 10:27 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Thu, Dec 14, 2023 at 9:59 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > On Tue, Dec 12, 2023 at 8:18 PM Chengming Zhou
> > <zhouchengming@xxxxxxxxxxxxx> wrote:
> > >
> > > In the !zpool_can_sleep_mapped() case such as zsmalloc, we need to first
> > > copy the entry->handle memory to a temporary memory, which is allocated
> > > using kmalloc.
> > >
> > > Obviously we can reuse the per-compressor dstmem to avoid allocating
> > > every time, since it's percpu-compressor and protected in mutex.
> >
> > You are trading more memory for faster speed.
> > Per-cpu data structure does not come free. It is expensive in terms of
> > memory on a big server with a lot of CPU. Think more than a few
> > hundred CPU. On the big servers, we might want to disable this
> > optimization to save a few MB RAM, depending on the gain of this
> > optimization.
> > Do we have any benchmark suggesting how much CPU overhead or latency
> > this per-CPU page buys us, compared to using kmalloc?
>
> IIUC we are not creating any new percpu data structures here. We are
> reusing existing percpu buffers used in the store path to compress
> into. Now we also use them in the load path if we need a temporary
> buffer to decompress into if the zpool backend does not support
> sleeping while the memory is mapped.

That sounds like a pure win then. Thanks for explaining it.

Hi Nhat,

> I think Chengming is re-using an existing per-CPU buffer for this
> purpose. IIUC, it was previously only used for compression
> (zswap_store) - Chengming is leveraging it for decompression (load and
> writeback) too with this patch. This sounds fine to me tbh, because
> both directions have to hold the mutex anyway, so that buffer is
> locked out - might as well use it.

Agree.

Acked-by: Chris Li <chrisl@xxxxxxxxxx>

>
> We're doing a bit more work in the mutex section (memcpy and handle
> (un)mapping) - but seems fine to me tbh.

Thanks for the heads up.

Chris
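
P.S. For anyone skimming the thread, a rough sketch of how I read the
change to the !zpool_can_sleep_mapped() load path. This is only my
paraphrase of the idea, not the actual diff; the names used here
(acomp_ctx->dstmem, acomp_ctx->mutex, zswap_find_zpool()) are taken
from my reading of the current zswap code and may not match the final
patch exactly:

	struct zpool *zpool = zswap_find_zpool(entry);
	struct crypto_acomp_ctx *acomp_ctx;
	u8 *src;

	/*
	 * Before: the load path kmalloc'ed a temporary buffer on every
	 * load when the zpool (e.g. zsmalloc) cannot sleep while mapped.
	 *
	 * After: grab the per-CPU compressor context first and reuse its
	 * dstmem buffer as the bounce buffer, under the same mutex that
	 * already serializes compression in the store path.
	 */
	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
	mutex_lock(acomp_ctx->mutex);

	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
	if (!zpool_can_sleep_mapped(zpool)) {
		/* Copy out while mapped, then decompress from dstmem. */
		memcpy(acomp_ctx->dstmem, src, entry->length);
		src = acomp_ctx->dstmem;
		zpool_unmap_handle(zpool, entry->handle);
	}

	/* ... decompress from src into the page ... */

	if (zpool_can_sleep_mapped(zpool))
		zpool_unmap_handle(zpool, entry->handle);
	mutex_unlock(acomp_ctx->mutex);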