Re: [PATCH v3] mm: add ztree - new allocator for use via zpool API

On Mon, Mar 07, 2022 at 05:27:24PM +0300, Ananda wrote:
> From: Ananda Badmaev <a.badmaev@xxxxxxxxxxxx>
> 
>     Ztree stores an integer number of compressed objects per ztree block.
> These blocks consist of several physical pages (from 1 to 8) and are
> arranged in trees.
>     The range from 0 to PAGE_SIZE is divided into a number of intervals
> corresponding to the number of trees, and each tree only handles objects
> whose size falls within its interval. The block trees are thus isolated
> from each other, which makes it possible to operate on several objects
> from different trees simultaneously.
>     Blocks make it possible to densely pack objects of various sizes,
> resulting in low internal fragmentation. The allocator also tries to fill
> incomplete blocks instead of adding new ones, which in many cases yields a
> compression ratio substantially higher than that of z3fold and zbud.
>     Apart from greater flexibility, ztree is significantly superior to
> other zpool backends with regard to worst-case execution times, thus
> allowing for better response times and real-time characteristics of the
> whole system.
> 
> Signed-off-by: Ananda Badmaev <a.badmaev@xxxxxxxxxxxx>
> ---
> 
> v2: fixed compiler warnings
> 
> v3: added documentation and const modifier to struct tree_descr
> 
>  Documentation/vm/ztree.rst | 104 +++++
>  MAINTAINERS                |   7 +
>  mm/Kconfig                 |  18 +
>  mm/Makefile                |   1 +
>  mm/ztree.c                 | 754 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 884 insertions(+)
>  create mode 100644 Documentation/vm/ztree.rst
>  create mode 100644 mm/ztree.c

There are a lot of style issues, please run scripts/checkpatch.pl.
 
> diff --git a/Documentation/vm/ztree.rst b/Documentation/vm/ztree.rst
> new file mode 100644
> index 000000000000..78cad0a6d616
> --- /dev/null
> +++ b/Documentation/vm/ztree.rst
> @@ -0,0 +1,104 @@
> +.. _ztree:
> +
> +=====
> +ztree
> +=====
> +
> +Ztree stores an integer number of compressed objects per ztree block.
> +These blocks consist of several consecutive physical pages (from 1 to 8)
> +and are arranged in trees. The range from 0 to PAGE_SIZE is divided into
> +a number of intervals corresponding to the number of trees, and each tree
> +only handles objects whose size falls within its interval. The block trees
> +are thus isolated from each other, which makes it possible to operate on
> +several objects from different trees simultaneously.
> +
> +Blocks make it possible to densely pack objects of various sizes,
> +resulting in low internal fragmentation. The allocator also tries to fill
> +incomplete blocks instead of adding new ones, which in many cases yields a
> +compression ratio substantially higher than that of z3fold and zbud. Apart
> +from greater flexibility, ztree is significantly superior to other zpool
> +backends with regard to worst-case execution times, thus allowing for
> +better response times and real-time characteristics of the whole system.
> +
> +Like z3fold and zsmalloc, ztree_alloc() does not return a dereferenceable
> +pointer. Instead, it returns an unsigned long handle which encodes the
> +actual location of the allocated object.
> +
> +Unlike the others, ztree works well with objects of various sizes - both
> +highly compressed and poorly compressed, including cases where both types
> +are present.
> +
> +Tests
> +=====

I don't think the sections below belong in the Documentation. IMO they are
more suitable for the changelog.
> +
> +Test platform
> +-------------
> +
> +QEMU arm64 virtual board with Debian 11.
> +
> +Kernel
> +------
> +
> +Linux 5.17-rc6 with the ztree and zram-over-zpool patches. Additionally,
> +counters and time measurements using ktime_get_ns() have been added to the
> +zpool API.
> +
> +Tools
> +-----
> +
> +ZRAM disks of size 1000M/1500M/2G, fio 3.25.
> +
> +Test description
> +----------------
> +
> +Run 2 fio scripts in parallel - one with VALUE=50, the other with VALUE=70.
> +This emulates page content heterogeneity.
> +
> +fio --bs=4k --randrepeat=1 --randseed=100 --refill_buffers \
> +    --scramble_buffers=1 --buffer_compress_percentage=VALUE \
> +    --direct=1 --loops=1 --numjobs=1 --filename=/dev/zram0 \
> +    --name=seq-write --rw=write --stonewall --name=seq-read \
> +    --rw=read --stonewall --name=seq-readwrite --rw=rw --stonewall \
> +    --name=rand-readwrite --rw=randrw --stonewall
> +
> +Results
> +-------
> +
> +ztree
> +~~~~~
> +
> +* average malloc time (us): 3.8
> +* average free time (us): 3.1
> +* average map time (us): 4.5
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~2200
> +* total zpool ops exceeding 1000 us: 29
> +
> +
> +zsmalloc
> +~~~~~~~~
> +
> +* average malloc time (us): 10.3
> +* average free time (us): 6.5
> +* average map time (us): 3.2
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~6200
> +* total zpool ops exceeding 1000 us: 1031
> +
> +z3fold
> +~~~~~~
> +
> +* average malloc time (us): 20.8
> +* average free time (us): 29.9
> +* average map time (us): 3.4
> +* average unmap time (us): 1.4
> +* worst zpool op time (us): ~4900
> +* total zpool ops exceeding 1000 us: 100
> +
> +zbud
> +~~~~
> +
> +* average malloc time (us): 8.1
> +* average free time (us): 4.0
> +* average map time (us): 0.3
> +* average unmap time (us): 0.3
> +* worst zpool op time (us): ~9400
> +* total zpool ops exceeding 1000 us: 727

-- 
Sincerely yours,
Mike.



