On Thu, Dec 14, 2023 at 9:59 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Tue, Dec 12, 2023 at 8:18 PM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
> >
> > In the !zpool_can_sleep_mapped() case such as zsmalloc, we need to first
> > copy the entry->handle memory to a temporary memory, which is allocated
> > using kmalloc.
> >
> > Obviously we can reuse the per-compressor dstmem to avoid allocating
> > every time, since it's percpu-compressor and protected in mutex.
>
> You are trading more memory for faster speed.
> Per-cpu data structure does not come free. It is expensive in terms of
> memory on a big server with a lot of CPU. Think more than a few
> hundred CPU. On the big servers, we might want to disable this
> optimization to save a few MB RAM, depending on the gain of this
> optimization.
> Do we have any benchmark suggesting how much CPU overhead or latency
> this per-CPU page buys us, compared to using kmalloc?

IIUC we are not creating any new percpu data structures here. We are
reusing the existing percpu buffers that the store path compresses
into. Now we also use them in the load path when we need a temporary
buffer for decompression, i.e. when the zpool backend does not support
sleeping while the memory is mapped.
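
To make the reuse concrete, here is a rough userspace sketch of the idea.
It is not the zswap code: the "dstmem" and mutex names follow the patch
description quoted above, while slot_ctx, slots_init(), toy_decompress(),
load_entry(), NR_SLOTS and SLOT_SIZE are hypothetical names made up for
illustration, and pthread mutexes stand in for the kernel ones. The only
point is that the load path borrows a buffer that already exists per slot
instead of allocating a temporary one on every load.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define NR_SLOTS   4      /* hypothetical stand-in for the number of CPUs */
#define SLOT_SIZE  8192   /* hypothetical scratch buffer size */

/* Models one per-CPU compressor context: a mutex plus a scratch buffer. */
struct slot_ctx {
	pthread_mutex_t mutex;
	unsigned char dstmem[SLOT_SIZE];
};

static struct slot_ctx slots[NR_SLOTS];

static void slots_init(void)
{
	for (int i = 0; i < NR_SLOTS; i++)
		pthread_mutex_init(&slots[i].mutex, NULL);
}

/* Toy "decompressor": a plain copy standing in for the real algorithm. */
static size_t toy_decompress(const unsigned char *src, size_t len,
			     unsigned char *dst)
{
	memcpy(dst, src, len);
	return len;
}

/*
 * Load path sketch. When the backend cannot sleep while mapped, copy the
 * mapped data into the slot's existing buffer (no new allocation), then
 * decompress from that copy. Returns the number of bytes written to dst,
 * or 0 if the slot index or length is out of range.
 */
static size_t load_entry(int slot, const unsigned char *mapped, size_t len,
			 int can_sleep_mapped, unsigned char *dst)
{
	const unsigned char *src = mapped;
	struct slot_ctx *ctx;
	size_t out;

	if (slot < 0 || slot >= NR_SLOTS || len > SLOT_SIZE)
		return 0;
	ctx = &slots[slot];

	pthread_mutex_lock(&ctx->mutex);
	if (!can_sleep_mapped) {
		memcpy(ctx->dstmem, mapped, len);
		src = ctx->dstmem;	/* the mapping could be dropped here */
	}
	out = toy_decompress(src, len, dst);
	pthread_mutex_unlock(&ctx->mutex);
	return out;
}
```

In this sketch, calling slots_init() once and then
load_entry(1, data, len, 0, out) goes through slots[1].dstmem, while
passing can_sleep_mapped = 1 reads straight from the mapped data. The
memory cost is the same in both cases because the buffers are already
there for the store path.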