Hi Minchan,

On Wed, Oct 30, 2019 at 1:10 AM Minchan Kim <minchan@xxxxxxxxxx> wrote:

<snip>
> >
> I ran fio on x86 with various compression sizes.
>
> left is zsmalloc. right is z3fold
>
> The operation order is
> seq-write
> rand-write
> seq-read
> rand-read
> mixed-seq
> mixed-rand
> trim
> mem_used - byte unit
>
> Last column mem_used is to indicate how many allocator used the memory
> to store compressed page
>
> 1) compression ratio 75
>
>     WRITE        2535      WRITE         1928
>     WRITE        2425      WRITE         1886
>     READ         6211      READ          5731
>     READ         6339      READ          6182
>     READ         1791      READ          1592
>     WRITE        1790      WRITE         1591
>     READ         1704      READ          1493
>     WRITE        1699      WRITE         1489
>     WRITE         984      WRITE          974
>     TRIM          984      TRIM           974
>     mem_used 29986816      mem_used  61239296
>
> For every operation, zsmalloc is faster than z3fold.
> Even, it used the 1/2 memory compared to z3fold.
>
> 2) compression ratio 66
>
>     WRITE        2125      WRITE         1258
>     WRITE        2107      WRITE         1233
>     READ         5714      READ          5793
>     READ         5948      READ          6065
>     READ         1667      READ          1248
>     WRITE        1666      WRITE         1247
>     READ         1521      READ          1218
>     WRITE        1517      WRITE         1215
>     WRITE         943      WRITE          870
>     TRIM          943      TRIM           870
>     mem_used 38158336      mem_used  76779520
>
> For only read operation, z3fold is a bit faster than zsmalloc about 2%.
> However, look at other operations which zsmalloc is much faster.
> Even, look at used memory.
>
> 3) compression ratio 50
>
>     WRITE        2051      WRITE         1109
>     WRITE        2029      WRITE         1087
>     READ         5366      READ          6364
>     READ         5575      READ          5785
>     READ         1497      READ          1121
>     WRITE        1496      WRITE         1121
>     READ         1432      READ          1065
>     WRITE        1428      WRITE         1062
>     WRITE         930      WRITE          838
>     TRIM          930      TRIM           838
>     mem_used 59932672      mem_used 104873984
>
> sequential read on z3fold is faster about 15%. However, look at other
> operations and used memory. zsmalloc is better.

There are two things to this: the measurements you've taken as such and
how they are relevant to this discussion. I'd be happy to discuss these
measurements in a separate thread if you specified more precisely what
kind of x86 the measurements were taken on.
However, my point was that there are rather common cases when people
want to use z3fold as a zRAM memory allocation backend. The fact that
there are other cases when people wouldn't want that is pretty natural
and doesn't need a proof. That's why I propose to use ZRAM over the
zpool API for the sake of flexibility. That would benefit various users
of ZRAM and, at the end of the day, the Linux kernel ecosystem.

<snip>
> Thanks for the testing. I also tried to test zbud with zram but failed because fio
> submit incompressible pages to zram even though it specify compress ratio 100%
> However, zbud doesn't support 4K page allocation so zram couldn't work on it
> at this moment. I tried various fio versions as well as old but everything failed.
>
> How did you test it successfully? Let me know your fio version.
> I want to investigate what's the performance bottleneck beside page copy
> so that I will optimize it.

You're very welcome. :) The patch to make zbud accept PAGE_SIZE pages
was posted a while ago [1] and it was a part of our previous
(pre-z3fold) discussion on the same subject, but you probably didn't
read it then.

> >
> > Now to the fun part.
> > zsmalloc:
> >   0 .text         00002908  0000000000000000  0000000000000000  00000040  2**2
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> > zbud:
> >   0 .text         0000072c  0000000000000000  0000000000000000  00000040  2**2
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >
> > And this does not cover dynamic memory allocation overhead which is
> > higher for zsmalloc. So once again, given that the compression ratio
> > is low (e. g. a simple HW accelerator is used), what would most
> > unbiased people prefer to use in this case?
>
> Zsmalloc has more features than zbud. That's why you see the code size
> difference. It was intentional because at that time most of users were
> mobile phones, TV and other smart devices. They needed those features.
>
> We could make those feature turned off at build time, which will improve
> performance and reduce code size a lot. It would be no problem if the
> user wanted to use zbud which is already lacking of those features.

I do support this idea and would like to help as much as I can, but why
should the people who want to use the ZRAM/zbud combo be left stranded
while we're working on reducing the zsmalloc code size by 4x?

With that said, let me also reiterate that there may be more allocators
coming, and in some cases zsmalloc won't be a good fit/alternative while
there will still be a need for a compressed RAM device. I hope you
understand.

Best regards,
   Vitaly

[1] https://lore.kernel.org/patchwork/patch/598210/