On Wed, Dec 27, 2023 at 5:16 PM Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> wrote: > > On 2023/12/27 15:01, Barry Song wrote: > > On Wed, Dec 27, 2023 at 7:38 PM Chengming Zhou > > <zhouchengming@xxxxxxxxxxxxx> wrote: > >> > >> On 2023/12/27 14:25, Barry Song wrote: > >>> On Wed, Dec 27, 2023 at 4:51 PM Chengming Zhou > >>> <zhouchengming@xxxxxxxxxxxxx> wrote: > >>>> > >>>> On 2023/12/27 08:23, Nhat Pham wrote: > >>>>> On Tue, Dec 26, 2023 at 3:30 PM Chris Li <chrisl@xxxxxxxxxx> wrote: > >>>>>> > >>>>>> Again, sorry I was looking at the decompression side rather than the > >>>>>> compression side. The compression side does not even offer a safe > >>>>>> version of the compression function. > >>>>>> That seems to be dangerous. It seems for now we should make the zswap > >>>>>> roll back to 2 page buffer until we have a safe way to do compression > >>>>>> without overwriting the output buffers. > >>>>> > >>>>> Unfortunately, I think this is the way - at least until we rework the > >>>>> crypto/compression API (if that's even possible?). > >>>>> I still think the 2 page buffer is dumb, but it is what it is :( > >>>> > >>>> Hi, > >>>> > >>>> I think it's a bug in `scomp_acomp_comp_decomp()`, which doesn't use > >>>> the caller passed "src" and "dst" scatterlist. Instead, it uses its own > >>>> per-cpu "scomp_scratch", which have 128KB src and dst. > >>>> > >>>> When compression done, it uses the output req->dlen to copy scomp_scratch->dst > >>>> to our dstmem, which has only one page now, so this problem happened. > >>>> > >>>> I still don't know why the alg->compress(src, slen, dst, &dlen) doesn't > >>>> check the dlen? It seems an obvious bug, right? > >>>> > >>>> As for this problem in `scomp_acomp_comp_decomp()`, this patch below > >>>> should fix it. I will set up a few tests to check later. > >>>> > >>>> Thanks! > >>>> > >>>> diff --git a/crypto/scompress.c b/crypto/scompress.c > >>>> index 442a82c9de7d..e654a120ae5a 100644 > >>>> --- a/crypto/scompress.c > >>>> +++ b/crypto/scompress.c > >>>> @@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > >>>> struct crypto_scomp *scomp = *tfm_ctx; > >>>> void **ctx = acomp_request_ctx(req); > >>>> struct scomp_scratch *scratch; > >>>> + unsigned int dlen; > >>>> int ret; > >>>> > >>>> if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE) > >>>> @@ -128,6 +129,8 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > >>>> if (!req->dlen || req->dlen > SCOMP_SCRATCH_SIZE) > >>>> req->dlen = SCOMP_SCRATCH_SIZE; > >>>> > >>>> + dlen = req->dlen; > >>>> + > >>>> scratch = raw_cpu_ptr(&scomp_scratch); > >>>> spin_lock(&scratch->lock); > >>>> > >>>> @@ -145,6 +148,9 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > >>>> ret = -ENOMEM; > >>>> goto out; > >>>> } > >>>> + } else if (req->dlen > dlen) { > >>>> + ret = -ENOMEM; > >>>> + goto out; > >>>> } > >>> > >>> This can't fix the problem as crypto_scomp_compress() has written overflow data. > >> > >> No, crypto_scomp_compress() writes to its own scomp_scratch->dst memory, then copy > >> to our dstmem. > >> > >>> > >>> BTW, in many cases, hardware-accelerators drivers/crypto can do compression and > >>> decompression by off-loading CPU; > >>> we won't have a chance to let hardware check the dst buffer size. so > >>> giving the dst buffer > >>> with enough length to the hardware's dma engine is the right way. I > >>> mean, we shouldn't > >>> change dst from 2pages to 1page. > >> > >> But how do we know 2 pages is enough for any compression algorithm? > >> > > > > There is no guarrette 2 pages is enough. > > > > We can invent our own compression algorithm, in our algorithm, we can > > add a lot of metadata, for example, longer than 4KB when the source data > > is uncompress-able. then dst can be larger than 2 * PAGE_SIZE. but this > > is not the case :-) This kind of algorithm may never succeed. > > > > For those in-tree algorithms, we have a WORST_COMPR_FACTOR. in > > ubifs, zram and zswap, we all use "2" as the worst comp factor. > > Thanks for your explanation! Maybe it's best for us to return to 2 pages > if no other people's comments. And this really need more documentation :-) I agree. we need some doc. besides, i actually think we can skip zswap frontend if over-compression is really happening. --- a/mm/zswap.c +++ b/mm/zswap.c @@ -1318,7 +1318,7 @@ bool zswap_store(struct folio *folio) if (zpool_malloc_support_movable(zpool)) gfp |= __GFP_HIGHMEM | __GFP_MOVABLE; ret = zpool_malloc(zpool, dlen, gfp, &handle); - if (ret == -ENOSPC) { + if (ret == -ENOSPC || dlen > PAGE_SIZE) { zswap_reject_compress_poor++; goto put_dstmem; } since we are not saving but wasting space in this corner case? > since there is no any comment or check in the acomp compress interface. > > /* > * @src: Source Data > * @dst: Destination data > * @slen: Size of the input buffer > * @dlen: Size of the output buffer and number of bytes produced > * @flags: Internal flags > * @__ctx: Start of private context data > */ > struct acomp_req { > struct crypto_async_request base; > struct scatterlist *src; > struct scatterlist *dst; > unsigned int slen; > unsigned int dlen; > u32 flags; > void *__ctx[] CRYPTO_MINALIGN_ATTR; > }; > > > > > /* > > * Some compressors, like LZO, may end up with more data then the input buffer. > > * So UBIFS always allocates larger output buffer, to be sure the compressor > > * will not corrupt memory in case of worst case compression. > > */ > > #define WORST_COMPR_FACTOR 2 > > > > #ifdef CONFIG_FS_ENCRYPTION > > #define UBIFS_CIPHER_BLOCK_SIZE FSCRYPT_CONTENTS_ALIGNMENT > > #else > > #define UBIFS_CIPHER_BLOCK_SIZE 0 > > #endif > > > > /* > > * How much memory is needed for a buffer where we compress a data node. > > */ > > #define COMPRESSED_DATA_NODE_BUF_SZ \ > > (UBIFS_DATA_NODE_SZ + UBIFS_BLOCK_SIZE * WORST_COMPR_FACTOR) > > > > > > For years, we have never seen 2 pages that can be a problem. but 1 > > page is definitely > > not enough, I remember I once saw many cases where accelerators' dmaengine > > can write more than 1 page. > > > >> Thanks. > >> > >>> > >>>> scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen, > >>>> 1); > >>> > >>> > Thanks Barry