On 2024/1/3 14:53, Barry Song wrote:
> On Wed, Dec 27, 2023 at 7:38 PM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>
>> On 2023/12/27 14:25, Barry Song wrote:
>>> On Wed, Dec 27, 2023 at 4:51 PM Chengming Zhou
>>> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On 2023/12/27 08:23, Nhat Pham wrote:
>>>>> On Tue, Dec 26, 2023 at 3:30 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> Again, sorry I was looking at the decompression side rather than the
>>>>>> compression side. The compression side does not even offer a safe
>>>>>> version of the compression function.
>>>>>> That seems to be dangerous. It seems for now we should make the zswap
>>>>>> roll back to 2 page buffer until we have a safe way to do compression
>>>>>> without overwriting the output buffers.
>>>>>
>>>>> Unfortunately, I think this is the way - at least until we rework the
>>>>> crypto/compression API (if that's even possible?).
>>>>> I still think the 2 page buffer is dumb, but it is what it is :(
>>>>
>>>> Hi,
>>>>
>>>> I think it's a bug in `scomp_acomp_comp_decomp()`, which doesn't use
>>>> the caller passed "src" and "dst" scatterlist. Instead, it uses its own
>>>> per-cpu "scomp_scratch", which have 128KB src and dst.
>>>>
>>>> When compression done, it uses the output req->dlen to copy scomp_scratch->dst
>>>> to our dstmem, which has only one page now, so this problem happened.
>>>>
>>>> I still don't know why the alg->compress(src, slen, dst, &dlen) doesn't
>>>> check the dlen? It seems an obvious bug, right?
>>>>
>>>> As for this problem in `scomp_acomp_comp_decomp()`, this patch below
>>>> should fix it. I will set up a few tests to check later.
>>>>
>>>> Thanks!
>>>>
>>>> diff --git a/crypto/scompress.c b/crypto/scompress.c
>>>> index 442a82c9de7d..e654a120ae5a 100644
>>>> --- a/crypto/scompress.c
>>>> +++ b/crypto/scompress.c
>>>> @@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>>>>         struct crypto_scomp *scomp = *tfm_ctx;
>>>>         void **ctx = acomp_request_ctx(req);
>>>>         struct scomp_scratch *scratch;
>>>> +       unsigned int dlen;
>>>>         int ret;
>>>>
>>>>         if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE)
>>>> @@ -128,6 +129,8 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>>>>         if (!req->dlen || req->dlen > SCOMP_SCRATCH_SIZE)
>>>>                 req->dlen = SCOMP_SCRATCH_SIZE;
>>>>
>>>> +       dlen = req->dlen;
>>>> +
>>>>         scratch = raw_cpu_ptr(&scomp_scratch);
>>>>         spin_lock(&scratch->lock);
>>>>
>>>> @@ -145,6 +148,9 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>>>>                         ret = -ENOMEM;
>>>>                         goto out;
>>>>                 }
>>>> +       } else if (req->dlen > dlen) {
>>>> +               ret = -ENOMEM;
>>>> +               goto out;
>>>>         }
>>>
>>> This can't fix the problem as crypto_scomp_compress() has written overflow data.
>>
>> No, crypto_scomp_compress() writes to its own scomp_scratch->dst memory, then copy
>> to our dstmem.
>
> Hi Chengming,
> I still feel these two memcpys are too big and unnecessary, so i sent
> a RFC[1] to remove
> them as well as another one removing memcpy in zswap[2].
> but unfortunately I don't have real hardware to run and collect data,
> I wonder if you are
> interested in testing and collecting data as you are actively
> contributing to zswap.

OK, I just tested these three patches on my server and found an
improvement in the kernel build test case on a tmpfs with zswap
(lz4 + zsmalloc) enabled.

        mm-stable 501a06fe8e4c  patched
real    1m38.028s               1m32.317s
user    19m11.482s              18m39.439s
sys     19m26.445s              17m5.646s

The improvement looks good! So feel free to add:

Tested-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>

Thanks.
>
> [1] https://lore.kernel.org/linux-mm/20240103053134.564457-1-21cnbao@xxxxxxxxx/
> [2]
> https://lore.kernel.org/linux-mm/20240103025759.523120-1-21cnbao@xxxxxxxxx/
> https://lore.kernel.org/linux-mm/20240103025759.523120-2-21cnbao@xxxxxxxxx/
>
> Thanks
> Barry
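
For anyone following the thread, here is a rough user-space sketch of the
length check discussed in the quoted patch: compress into a private scratch
buffer, then refuse to copy out if the result is larger than the capacity
the caller asked for. It is not kernel code. `toy_compress()`,
`bounded_compress()` and `SCRATCH_SIZE` are hypothetical names made up for
illustration, and the "compressor" just copies the input and appends four
bytes so that the output can exceed a small destination.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the per-cpu scratch buffer; the size here is arbitrary. */
#define SCRATCH_SIZE 128

/*
 * Hypothetical compressor: copies the input and appends 4 bytes of
 * overhead. On entry *dlen is the capacity of dst, on success it is
 * the number of bytes produced.
 */
static int toy_compress(const unsigned char *src, size_t slen,
			unsigned char *dst, size_t *dlen)
{
	if (slen + 4 > *dlen)
		return -ENOSPC;
	memcpy(dst, src, slen);
	memset(dst + slen, 0, 4);
	*dlen = slen + 4;
	return 0;
}

/*
 * Compress via the scratch buffer, then copy to the caller's buffer
 * only if the result fits in the capacity passed in through *dlen.
 */
static int bounded_compress(const unsigned char *src, size_t slen,
			    unsigned char *dst, size_t *dlen)
{
	static unsigned char scratch[SCRATCH_SIZE];
	size_t cap = *dlen;		/* caller's capacity, saved up front */
	size_t out = SCRATCH_SIZE;	/* scratch capacity for the compressor */
	int ret;

	ret = toy_compress(src, slen, scratch, &out);
	if (ret)
		return ret;

	/* Same idea as the "req->dlen > dlen" check in the quoted patch. */
	if (out > cap)
		return -ENOMEM;

	memcpy(dst, scratch, out);
	*dlen = out;
	return 0;
}
```

With a 16-byte input, a destination of 8 bytes is rejected with -ENOMEM
before any copy happens, while a 32-byte destination receives 20 bytes.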