The fio command used in all tests, run against a zram disk with the lz4 compressor:
fio --bs=4k --randrepeat=1 --randseed=100 --refill_buffers \
--buffer_compress_percentage=VALUE --scramble_buffers=1 \
--direct=1 --loops=15 --numjobs=4 --filename=/dev/zram0 \
--name=seq-write --rw=write --stonewall --name=seq-read \
--rw=read --stonewall --name=seq-readwrite --rw=rw --stonewall \
--name=rand-readwrite --rw=randrw --stonewall
where VALUE is 30, 50 or 70 depending on the particular test.
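For reference, a minimal sketch of the zram device setup implied by the command
above (standard zram sysfs knobs; how the allocator backend under test is
selected depends on the patched kernel and is not shown here):

# load the zram module and create one device
modprobe zram num_devices=1
# select the lz4 compressor for /dev/zram0
echo lz4 > /sys/block/zram0/comp_algorithm
# set the disk size for the given test run, e.g. 256 MB
echo 256M > /sys/block/zram0/disksize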
1. Average results: QEMU simulation of a Cortex-A53 on an x86 PC;
buffer_compress_percentage 30/50; disk size 128/192/256 MB
z3fold
WRITE: 42 MB/s
READ: 189 MB/s
READ/WRITE (seq): 31 MB/s
READ/WRITE (rand): 29 MB/s
Compression ratio: 1.5
zsmalloc
WRITE: 88 MB/s
READ: 187 MB/s
READ/WRITE (seq): 54 MB/s
READ/WRITE (rand): 50 MB/s
Compression ratio: 2.4
ztree
WRITE: 94.5 MB/s
READ: 206 MB/s
READ/WRITE (seq): 59.5 MB/s
READ/WRITE (rand): 56 MB/s
Compression ratio: 2.0
2. Average results: armhf Raspberry Pi 3 Model B;
buffer_compress_percentage 50; disk size 256/512 MB
z3fold
WRITE: 120 MB/s
READ: 240 MB/s
READ/WRITE (seq): 85 MB/s
READ/WRITE (rand): 80 MB/s
Compression ratio: 2.0
zsmalloc
WRITE: 127 MB/s
READ: 296 MB/s
READ/WRITE (seq): 91 MB/s
READ/WRITE (rand): 84 MB/s
Compression ratio: 2.5
ztree
WRITE: 132 MB/s
READ: 275 MB/s
READ/WRITE (seq): 94 MB/s
READ/WRITE (rand): 90 MB/s
Compression ratio: 2.05
3. Average results: arm64 Raspberry Pi 4 Model B;
buffer_compress_percentage 50/70; disk size 256/512/1024 MB
z3fold
WRITE: 367 MB/s
READ: 1378 MB/s
READ/WRITE (seq): 265 MB/s
READ/WRITE (rand): 254 MB/s
Compression ratio: 1.4
zsmalloc
WRITE: 595 MB/s
READ: 1397 MB/s
READ/WRITE (seq): 407 MB/s
READ/WRITE (rand): 372 MB/s
Compression ratio: 2.2
ztree
WRITE: 650 MB/s
READ: 1282 MB/s
READ/WRITE (seq): 400 MB/s
READ/WRITE (rand): 381 MB/s
Compression ratio: 1.8
In real-world cases performance could be higher, since fio puts an extremely
uneven load on the trees: mostly one tree out of the 16 gets loaded.
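For reference, the compression ratios above can be read back from zram's
statistics; a sketch, assuming they come from the first two mm_stat fields
(orig_data_size and compr_data_size, as documented for zram):

# compression ratio = orig_data_size / compr_data_size
awk '{ printf "compression ratio: %.2f\n", $1 / $2 }' /sys/block/zram0/mm_stat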
16.09.2021, 13:38, "Vitaly Wool" <vitaly.wool@xxxxxxxxxxxx>:
> On Thu, Sep 16, 2021 at 12:12 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>> On 9/16/21 10:51, Ananda Badmaev wrote:
>>> ztree is a versatile backend for zswap and potentially zram. It got its name
>>> from its use of red-black trees to store blocks of compressed objects.
>>> These blocks consist of several consecutive pages, and ztree keeps an integer
>>> number of objects per block.
>>>
>>> For zram, ztree has better worst-case malloc() and free() times than zsmalloc,
>>> does not deteriorate over time and has a slightly worse but comparable
>>> compression ratio. For zswap, ztree has better worst-case malloc() and free()
>>> times than z3fold, a better compression ratio than z3fold, and supports
>>> reclaim, unlike zsmalloc.
>>>
>>> Signed-off-by: Ananda Badmaev <a.badmaev@xxxxxxxxxxxx>
>> So how many of these allocators do we need? Minimally, IMHO, some data
>> should be provided for the performance comparison claims above.
>>
>> It sounds like, if this is based on z3fold (I haven't actually compared the
>> code) and is better in every aspect, why not just "upgrade" z3fold to ztree?
> We have collected a lot of data and it wouldn't fit in the cover
> message. I believe Ananda will follow up with comparison details on
> various architectures.
>
> I wouldn't say that ztree is completely based on z3fold; the latter
> might have served as an inspiration, and ztree shares the idea that
> keeping an integral number of objects per page is a good thing. With
> that said, ztree operates on blocks, not on pages, which allows for
> more flexibility.
>
> ~Vitaly