On 2023/12/28 16:03, Barry Song wrote:
> On Wed, Dec 27, 2023 at 7:32 PM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>
>> On 2023/12/27 09:24, Barry Song wrote:
>>> On Wed, Dec 27, 2023 at 4:56 AM Chengming Zhou
>>> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>>>
>>>> In the !zpool_can_sleep_mapped() case such as zsmalloc, we need to first
>>>> copy the entry->handle memory to a temporary memory, which is allocated
>>>> using kmalloc.
>>>>
>>>> Obviously we can reuse the per-compressor dstmem to avoid allocating
>>>> every time, since it's percpu-compressor and protected in percpu mutex.
>>>
>>> what is the benefit of this since we are actually increasing lock contention
>>> by reusing this buffer between multiple compression and decompression
>>> threads?
>>
>> This mutex is already reused in all compress/decompress paths even before
>> the reuse optimization. I think the best way maybe to use separate crypto_acomp
>> for compression and decompression.
>>
>> Do you think the lock contention will be increased because we now put zpool_map_handle()
>> and memcpy() in the lock section? Actually, we can move zpool_map_handle() before
>> the lock section if needed, but that memcpy() should be protected in lock section.
>>
>>>
>>> this mainly affects zsmalloc which can't sleep? do we have performance
>>> data?
>>
>> Right, last time when test I remembered there is very minor performance difference.
>> The main benefit here is to simply the code much and delete one failure case.
>
> ok.
>
> For the majority of hardware, people are using CPU-based
> compression/decompression, there is no chance they will sleep.
> Thus, all compression/decompression can be done in a zpool_map
> section, there is *NO* need to copy at all! Only for

Yes, very good for zsmalloc.
> those hardware which can provide a HW-accelerator to offload CPU,
> crypto will actually wait for completion by
>
> static inline int crypto_wait_req(int err, struct crypto_wait *wait)
> {
>         switch (err) {
>         case -EINPROGRESS:
>         case -EBUSY:
>                 wait_for_completion(&wait->completion);
>                 reinit_completion(&wait->completion);
>                 err = wait->err;
>                 break;
>         }
>
>         return err;
> }
>
> for CPU-based alg, we have completed the compr/decompr within
> crypto_acomp_decompress() synchronously. they won't return
> EINPROGRESS, EBUSY.

Ok, this is useful to know.

>
> The problem is that crypto_acomp won't expose this information to its
> users. if it does, we can use this info, we will totally avoid the
> code of copying zsmalloc's data to a tmp buffer for the most majority
> users of zswap.

Agree, I think it's worthwhile to export, so zsmalloc users don't need to
prepare the temporary buffer and copy in the majority case.

Thanks!

>
> But I am not sure if we can find a way to convince Herbert(+To) :-)
>
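
To make the idea above concrete, here is a rough userspace sketch of the
decision being discussed: only bounce the compressed data through a
temporary buffer when the pool cannot sleep while mapped AND the
compressor may complete asynchronously. This is not kernel code. Every
name in it (mock_ctx, needs_bounce, load_entry, the is_async flag) is
hypothetical and invented for illustration; in particular, is_async
stands in for whatever information crypto_acomp would have to export,
and "decompression" is modelled as a plain memcpy.

```c
/* Userspace illustration only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical stand-in for the per-CPU compressor context. */
struct mock_ctx {
	bool can_sleep_mapped;          /* models zpool_can_sleep_mapped() */
	bool is_async;                  /* assumed flag: alg may return -EINPROGRESS/-EBUSY */
	unsigned char dstmem[BUF_SIZE]; /* shared scratch buffer */
	unsigned int copies;            /* counts bounce copies, for demonstration */
};

/* A copy is only needed if we might sleep while the handle is mapped. */
bool needs_bounce(const struct mock_ctx *ctx)
{
	return !ctx->can_sleep_mapped && ctx->is_async;
}

/* Placeholder "decompression": just copies len bytes from src to dst. */
size_t mock_decompress(const unsigned char *src, size_t len, unsigned char *dst)
{
	memcpy(dst, src, len);
	return len;
}

/* Load one entry; 'mapped' models the pointer returned by mapping the handle. */
size_t load_entry(struct mock_ctx *ctx, const unsigned char *mapped,
		  size_t len, unsigned char *out)
{
	const unsigned char *src = mapped;

	assert(len <= BUF_SIZE);
	if (needs_bounce(ctx)) {
		/* Async case: copy out so the mapping could be dropped before waiting. */
		memcpy(ctx->dstmem, mapped, len);
		ctx->copies++;
		src = ctx->dstmem;
	}
	/* Sync case: decompress straight from the mapped memory, no copy. */
	return mock_decompress(src, len, out);
}
```

With can_sleep_mapped = false and is_async = false (the CPU-based
compressor on zsmalloc case), load_entry() never touches dstmem; only
the is_async = true variant pays for the extra memcpy.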