Hi Yosry,

On Thu, Dec 14, 2023 at 10:27 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Thu, Dec 14, 2023 at 9:59 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > On Tue, Dec 12, 2023 at 8:18 PM Chengming Zhou
> > <zhouchengming@xxxxxxxxxxxxx> wrote:
> > >
> > > In the !zpool_can_sleep_mapped() case such as zsmalloc, we need to first
> > > copy the entry->handle memory to a temporary memory, which is allocated
> > > using kmalloc.
> > >
> > > Obviously we can reuse the per-compressor dstmem to avoid allocating
> > > every time, since it's percpu-compressor and protected in mutex.
> >
> > You are trading more memory for faster speed.
> > Per-cpu data structure does not come free. It is expensive in terms of
> > memory on a big server with a lot of CPU. Think more than a few
> > hundred CPU. On the big servers, we might want to disable this
> > optimization to save a few MB RAM, depending on the gain of this
> > optimization.
> > Do we have any benchmark suggesting how much CPU overhead or latency
> > this per-CPU page buys us, compared to using kmalloc?
>
> IIUC we are not creating any new percpu data structures here. We are
> reusing existing percpu buffers used in the store path to compress
> into. Now we also use them in the load path if we need a temporary
> buffer to decompress into if the zpool backend does not support
> sleeping while the memory is mapped.

That sounds like a pure win then. Thanks for explaining it.

Hi Nhat,

> I think Chengming is re-using an existing per-CPU buffer for this
> purpose. IIUC, it was previously only used for compression
> (zswap_store) - Chengming is leveraging it for decompression (load and
> writeback) too with this patch. This sounds fine to me tbh, because
> both directions have to hold the mutex anyway, so that buffer is
> locked out - might as well use it.

Agree.

Acked-by: Chris Li <chrisl@xxxxxxxxxx>

>
> We're doing a bit more work in the mutex section (memcpy and handle
> (un)mapping) - but seems fine to me tbh.

Thanks for the heads up.

Chris
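
P.S. For anyone skimming the thread, a rough sketch of how I read the
change to the !zpool_can_sleep_mapped() load path. This is only my
paraphrase of the idea, not the actual diff; the names used here
(acomp_ctx->dstmem, acomp_ctx->mutex, zswap_find_zpool()) are taken
from my reading of the current zswap code and may not match the final
patch exactly:

	struct zpool *zpool = zswap_find_zpool(entry);
	struct crypto_acomp_ctx *acomp_ctx;
	u8 *src;

	/*
	 * Before: the load path kmalloc'ed a temporary buffer on every
	 * load when the zpool (e.g. zsmalloc) cannot sleep while mapped.
	 *
	 * After: grab the per-CPU compressor context first and reuse its
	 * dstmem buffer as the bounce buffer, under the same mutex that
	 * already serializes compression in the store path.
	 */
	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
	mutex_lock(acomp_ctx->mutex);

	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
	if (!zpool_can_sleep_mapped(zpool)) {
		/* Copy out while mapped, then decompress from dstmem. */
		memcpy(acomp_ctx->dstmem, src, entry->length);
		src = acomp_ctx->dstmem;
		zpool_unmap_handle(zpool, entry->handle);
	}

	/* ... decompress from src into the page ... */

	if (zpool_can_sleep_mapped(zpool))
		zpool_unmap_handle(zpool, entry->handle);
	mutex_unlock(acomp_ctx->mutex);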