Re: [PATCHv2] zsmalloc: allow only one active pool compaction context

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Mon, 17 Apr 2023 11:32:22 -0700

On Mon, Apr 17, 2023 at 6:54 AM Sergey Senozhatsky
<senozhatsky@xxxxxxxxxxxx> wrote:
>
> A zsmalloc pool can be compacted concurrently by many contexts,
> e.g.
>
>  cc1 handle_mm_fault()
>       do_anonymous_page()
>        __alloc_pages_slowpath()
>         try_to_free_pages()
>          do_try_to_free_pages()
>           lru_gen_shrink_node()
>            shrink_slab()
>             do_shrink_slab()
>              zs_shrinker_scan()
>               zs_compact()
>
> This creates unnecessary contention as all those processes
> compete for access to the same classes. A single compaction
> process is enough. Moreover, contention created by multiple
> compaction processes impacts other zsmalloc functions,
> e.g. zs_malloc(), since zsmalloc uses a "global" pool->lock to
> synchronize access to the pool.
>
> Introduce a pool compaction-in-progress flag (an atomic_t) and
> permit only one compaction context at a time. This reduces overall
> pool->lock contention.
>
> /proc/lock_stat after a make -j$((`nproc`+1)) Linux kernel build,
> for &pool->lock#3:
>
>                 Base           Patched
> ------------------------------------------
> con-bounces     2035730        1540066
> contentions     2343871        1774348
> waittime-min    0.10           0.10
> waittime-max    4004216.24     2745.22
> waittime-total  101334168.29   67865414.91
> waittime-avg    43.23          38.25
> acq-bounces     2895765        2186745
> acquisitions    6247686        5136943
> holdtime-min    0.07           0.07
> holdtime-max    2605507.97     482439.16
> holdtime-total  9998599.59     5107151.01
> holdtime-avg    1.60           0.99

The numbers look better with an atomic than with a mutex. Is this
just noise, or a significant difference? (I am not familiar with
lock-stat.)
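
For a rough sense of scale, the Base vs. Patched columns quoted above
can be turned into relative reductions. This is only a sketch: the
helper names (pct_reduction, print_row, report) are made up for
illustration, and the inputs are copied from the table and from the
elapsed times in this mail (2:13.82 = 133.82 s, 2:03.63 = 123.63 s).
It says nothing about atomic vs. mutex, since the v1 numbers are not
quoted here.

```c
#include <stdio.h>

/* Relative reduction going from base to patched, in percent. */
static double pct_reduction(double base, double patched)
{
	return (base - patched) / base * 100.0;
}

/* Hypothetical helper: print one metric and its relative reduction. */
static void print_row(const char *name, double base, double patched)
{
	printf("%-15s %6.1f%%\n", name, pct_reduction(base, patched));
}

/* Values copied from the lock_stat table and the test run times. */
static void report(void)
{
	print_row("contentions", 2343871.0, 1774348.0);
	print_row("waittime-total", 101334168.29, 67865414.91);
	print_row("holdtime-total", 9998599.59, 5107151.01);
	print_row("elapsed (s)", 133.82, 123.63);
}
```

Calling report() from a main() prints one line per metric. A single
run per configuration cannot separate signal from noise, so these
percentages only describe the two runs that were posted.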

>
> Test run time:
> Base
> 2775.15user 1709.13system 2:13.82elapsed 3350%CPU
>
> Patched
> 2608.25user 1439.03system 2:03.63elapsed 3273%CPU
>
> Signed-off-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>

FWIW,
Reviewed-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>

> ---
>  mm/zsmalloc.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index cc81dfba05a0..dfec2fc6a30f 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -264,6 +264,7 @@ struct zs_pool {
>         struct work_struct free_work;
>  #endif
>         spinlock_t lock;
> +       atomic_t compaction_in_progress;
>  };
>
>  struct zspage {
> @@ -2274,6 +2275,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>         struct size_class *class;
>         unsigned long pages_freed = 0;
>
> +       if (atomic_xchg(&pool->compaction_in_progress, 1))
> +               return 0;
> +
>         for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
>                 class = pool->size_class[i];
>                 if (class->index != i)
> @@ -2281,6 +2285,7 @@ unsigned long zs_compact(struct zs_pool *pool)
>                 pages_freed += __zs_compact(pool, class);
>         }
>         atomic_long_add(pages_freed, &pool->stats.pages_compacted);
> +       atomic_set(&pool->compaction_in_progress, 0);
>
>         return pages_freed;
>  }
> @@ -2388,6 +2393,7 @@ struct zs_pool *zs_create_pool(const char *name)
>
>         init_deferred_free(pool);
>         spin_lock_init(&pool->lock);
> +       atomic_set(&pool->compaction_in_progress, 0);
>
>         pool->name = kstrdup(name, GFP_KERNEL);
>         if (!pool->name)
> --
> 2.40.0.634.g4ca3ef3211-goog
>
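
For anyone who wants to try the pattern outside the kernel, here is a
minimal userspace sketch of the same "only one compactor at a time"
guard, using C11 <stdatomic.h>. struct pool, pool_init(),
compaction_try_enter(), compaction_exit() and pool_compact() are
hypothetical stand-ins, not zsmalloc code. C11 atomic_exchange() and
atomic_store() with their default ordering are assumed to be a close
enough analogue of the kernel's atomic_xchg() and atomic_set() for
illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct zs_pool; only the flag matters. */
struct pool {
	atomic_int compaction_in_progress;
	unsigned long pages_compacted;
};

static void pool_init(struct pool *p)
{
	atomic_init(&p->compaction_in_progress, 0);
	p->pages_compacted = 0;
}

/* Returns true if the caller became the single active compactor. */
static bool compaction_try_enter(struct pool *p)
{
	/* atomic_exchange() returns the previous value: 0 means we won. */
	return atomic_exchange(&p->compaction_in_progress, 1) == 0;
}

static void compaction_exit(struct pool *p)
{
	atomic_store(&p->compaction_in_progress, 0);
}

/*
 * Same shape as the patched zs_compact(): return 0 right away if
 * another context is already compacting, otherwise do the work and
 * clear the flag. "work" is a placeholder for the per-class loop.
 */
static unsigned long pool_compact(struct pool *p, unsigned long work)
{
	if (!compaction_try_enter(p))
		return 0;

	p->pages_compacted += work;
	compaction_exit(p);
	return work;
}
```

Unlike a mutex, a losing caller does not sleep or spin here. It
returns 0 immediately, which matches the early "return 0" in the
diff above.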