On (24/01/06 15:38), Barry Song wrote: > On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky > <senozhatsky@xxxxxxxxxxxx> wrote: > > > > On (24/01/03 13:30), Barry Song wrote: > > > There is no need to keep zcomp_strm's buffers contiguous physically. > > > And rarely, 1-order allocation can fail while buddy is seriously > > > fragmented. > > > > Dunno. Some of these don't sound like convincing reasons, I'm afraid. > > We don't allocate compression streams all the time, we do it once > > per-CPU. And if the system is under such a terrible memory pressure > > We actually do it many times actually because we free it while unplugging and > re-allocate it during hotplugging. this can happen quite often for systems like > Android using hotplug for power management. Okay, makes sense. Do you see these problems in real life? I don't recall any reports. > > then one probably should not use zram at all, because zsmalloc needs > > pages for its pool. > > In my humble opinion, 1-order allocation and 0-order allocation are different > things, 1-order is still more difficult though it is easier than > 2-order which was > a big pain causing allocation latency for tasks' kernel stacks and negatively > affecting user experience. it has now been replaced by vmalloc and makes > life easier :-) Sure. > > I also wonder whether Android uses HW compression, in which case we > > may need to have physically contig pages. Not to mention TLB shootdowns > > that virt contig pages add to the picture. > > I don't understand how HW compression and TLB shootdown are related as zRAM > is using a traditional comp API. Oh, those are not related. TLB shootdowns are what now will be added to all compressions/decompressions, so it's sort of extra cost. HW compression (which android may be doing?) is another story. Did you run any perf tests on zram w/ and w/o the patch? > We are always passing a virtual address, traditional HW drivers use their own > buffers to do DMA. > > int crypto_comp_compress(struct crypto_comp *comp, > const u8 *src, unsigned int slen, > u8 *dst, unsigned int *dlen); > int crypto_comp_decompress(struct crypto_comp *comp, > const u8 *src, unsigned int slen, > u8 *dst, unsigned int *dlen); > > In new acomp API, we are passing a sg - users' buffers to drivers directly, > sg_init_one(&input, src, entry->length); > sg_init_table(&output, 1); > sg_set_page(&output, page, PAGE_SIZE, 0); > acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen); > ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), > &acomp_ctx->wait); > > but i agree one-nents sg might have some advantage in scompress case Right. > after we move > to new acomp APIs if we have this patch I sent recently [patch 3/3], > https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@xxxxxxxxx/ Nice.