On Wed, 28 Jun 2023 at 08:21, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Mon, Jun 26, 2023 at 06:13:44PM +0800, Herbert Xu wrote:
> > On Mon, Jun 26, 2023 at 12:03:04PM +0200, Ard Biesheuvel wrote:
> > >
> > > In any case, what I would like to see addressed is the horrid scomp to
> > > acomp layer that ties up megabytes of memory in scratch space, just to
> > > emulate the acomp interface on top of scomp drivers, while no code
> > > exists that makes use of the async nature. Do you have an idea on how
> > > we might address this particular issue?
> >
> > The whole reason why we need to allocate megabytes of memory is because
> > of the lack of SG lists in the underlying algorithm. If they
> > actually used SG lists and allocated pages as they went during
> > decompression, then we wouldn't need to pre-allocate any memory
> > at all.
>
> I don't think that is a realistic expectation. Decompressors generally need a
> contiguous buffer for decompressed data anyway, up to a certain size which is
> 32KB for DEFLATE but can be much larger for the more modern algorithms. This is
> because they decode "matches" that refer to previously decompressed data by
> offset, and it has to be possible to index the data efficiently.
>
> (Some decompressors, e.g. zlib, provide "streaming" APIs where you can read
> arbitrary amounts. But that works by actually decompressing into an internal
> buffer that has sufficient size, then copying to the user-provided buffer.)
>
> The same applies to compressors too, with regards to the original data.
>
> I think the "input/output is a list of pages" model just fundamentally does not
> work well for software compression and decompression. To support it, either
> large temporary buffers are needed (they might be hidden inside the
> (de)compressor, but they are there), or vmap() or vm_map_ram() is needed.
>
> FWIW, f2fs compression uses vm_map_ram() and skips the crypto API entirely...
>
> If acomp has to be kept for the hardware support, then maybe its scomp backend
> should use vm_map_ram() instead of scratch buffers?
>

Yeah, but we'll run into similar issues there, related to the fact that
scatterlists can describe arbitrary sequences of sub-page size memory
chunks, which means vmap()ing the pages may not be sufficient to get a
virtually contiguous representation of the buffers.

With zswap being the only current user, which uses a single contiguous
buffer for decompression out of place and blocks on the completion, the
level of additional complexity we have in the acomp stack is mind
boggling. And the scomp-to-acomp adaptation layer, with its fixed-size
per-CPU input and output buffers (implying that acomp input/output has
a hardcoded size limit), which are never freed, makes it rather
unpalatable to me, tbh.
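
To make that last point concrete, the glue layer has to do roughly the
following for every request (an illustrative sketch only, not the actual
crypto/scomp.c code: sync_decompress() is a hypothetical stand-in for the
synchronous scomp callback, the names are made up, and the scratch
allocation at init time is omitted). Every request gets bounced through
pre-allocated per-CPU linear buffers, and those buffers are also what
imposes the hardcoded size limit:

#include <linux/percpu.h>
#include <linux/errno.h>
#include <crypto/scatterwalk.h>
#include <crypto/acompress.h>

#define SCRATCH_SIZE	(128 * 1024)	/* illustrative per-request cap */

struct scratch_sketch {
	void *src;	/* SCRATCH_SIZE bytes, allocated at init, never freed */
	void *dst;	/* SCRATCH_SIZE bytes, allocated at init, never freed */
};

static DEFINE_PER_CPU(struct scratch_sketch, scratch_sketch);

/* hypothetical synchronous decompressor operating on linear buffers */
static int sync_decompress(const u8 *src, unsigned int slen,
			   u8 *dst, unsigned int *dlen);

static int sketch_acomp_decompress(struct acomp_req *req)
{
	struct scratch_sketch *s = get_cpu_ptr(&scratch_sketch);
	unsigned int dlen = req->dlen;
	int ret = -EINVAL;

	/* the scratch buffers impose a hard cap on the request size */
	if (req->slen > SCRATCH_SIZE || req->dlen > SCRATCH_SIZE)
		goto out;

	/* bounce #1: linearize the source scatterlist into scratch memory */
	scatterwalk_map_and_copy(s->src, req->src, 0, req->slen, 0);

	/* run the synchronous, linear-buffer decompressor */
	ret = sync_decompress(s->src, req->slen, s->dst, &dlen);
	if (ret)
		goto out;

	/* bounce #2: copy the result back into the destination scatterlist */
	scatterwalk_map_and_copy(s->dst, req->dst, 0, dlen, 1);
	req->dlen = dlen;
out:
	put_cpu_ptr(&scratch_sketch);
	return ret;
}

And substituting vm_map_ram() for those two copies only helps when the
scatterlist entries happen to be whole, page-aligned pages, which is
the sub-page problem above.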