Hi Yosry,

> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Sent: Friday, November 8, 2024 12:14 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> hannes@xxxxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx;
> usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying
> <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> linux-crypto@xxxxxxxxxxxxxxx; herbert@xxxxxxxxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx;
> ebiggers@xxxxxxxxxx; surenb@xxxxxxxxxx; Accardi, Kristen C
> <kristen.c.accardi@xxxxxxxxx>; zanussi@xxxxxxxxxx; Feghali, Wajdi K
> <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v3 08/13] mm: zswap: acomp_ctx mutex lock/unlock
> optimizations.
>
> On Wed, Nov 6, 2024 at 11:21 AM Kanchana P Sridhar
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > This patch implements two changes with respect to the acomp_ctx mutex lock:
>
> The commit subject is misleading; one of these is definitely not an
> optimization.
>
> Also, if we are doing two unrelated things, we should do them in two
> separate commits.

Thanks for the code review comments. I agree, these should be two
separate commits.

> >
> > 1) The mutex lock is not acquired/released in zswap_compress(). Instead,
> >    zswap_store() acquires the mutex lock once before compressing each
> >    page in a large folio, and releases the lock once all pages in the
> >    folio have been compressed. This should reduce some compute cycles
> >    in case of large folio stores.
>
> I understand how bouncing the mutex around can regress performance,
> but I expect this to be more due to things like cacheline bouncing and
> allowing reclaim to make meaningful progress before giving up the
> mutex, rather than the actual cycles spent acquiring the mutex.
>
> Do you have any numbers to support that this is a net improvement? We
> usually base optimizations on data.

Makes sense. I will gather the data to motivate this. In my internal
validation, I have been re-evaluating whether this acquire/release once
per large folio store still makes sense, because it runs the risk of
introducing long-latency paths inside a sleeping mutex. I will quantify
the benefits of this (if any) and update.

> > 2) In zswap_decompress(), the mutex lock is released after the conditional
> >    zpool_unmap_handle() based on "src != acomp_ctx->buffer" rather than
> >    before. This ensures that the value of "src" obtained earlier does not
> >    change. If the mutex lock is released before the comparison of "src",
> >    it is possible that another call to reclaim by the same process could
> >    obtain the mutex lock and overwrite the value of "src".
>
> This seems like a bug fix for 9c500835f279 ("mm: zswap: fix kernel BUG
> in sg_init_one"). That commit changed checking acomp_ctx->is_sleepable
> outside the mutex, which seems to be safe, to checking
> acomp_ctx->buffer.
>
> If my understanding is correct, this needs to be sent separately as a
> hotfix, with a proper Fixes tag and CC stable. The side effect would
> be that we never unmap the zpool handle and essentially leak the
> memory, right?

Sure, I will send this separately as a hotfix. Yes, the side effect you
describe is correct.
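To make the ordering issue concrete for the hotfix changelog, here is a
condensed sketch of the two orderings (not the verbatim mm/zswap.c code;
the zpool map, copy, and decompress steps are elided), with the
reasoning above restated as comments:

    /* Buggy ordering, after 9c500835f279: */
    mutex_unlock(&acomp_ctx->mutex);
    /*
     * Window: another reclaim by the same process can take the mutex
     * here and touch the protected per-CPU acomp_ctx state before the
     * check below runs.
     */
    if (src != acomp_ctx->buffer)
            zpool_unmap_handle(zpool, entry->handle);  /* can be skipped:
                                                          handle leaked  */

    /* Fixed ordering: */
    if (src != acomp_ctx->buffer)
            zpool_unmap_handle(zpool, entry->handle);  /* decided under
                                                          the mutex */
    mutex_unlock(&acomp_ctx->mutex);  /* released only after we are done
                                         accessing protected fields */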
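And to summarize point (1) above in one place: the change being
re-evaluated hoists the lock out of the per-page loop in zswap_store()
(condensed from the diff below; the per-page store logic is elided):

    acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
    mutex_lock(&acomp_ctx->mutex);          /* once per folio store... */

    for (index = 0; index < nr_pages; ++index) {
            /*
             * ... compress and store this page; zswap_compress() now
             * expects the caller to hold acomp_ctx->mutex ...
             */
    }
    ...
put_pool:
    mutex_unlock(&acomp_ctx->mutex);        /* ...instead of per page */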
Thanks,
Kanchana

> >
> > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@xxxxxxxxx>
> > ---
> >  mm/zswap.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index f6316b66fb23..3e899fa61445 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -880,6 +880,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
> >          return 0;
> >  }
> >
> > +/*
> > + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure.
> > + */
> >  static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> >                             struct zswap_pool *pool)
> >  {
> > @@ -895,8 +898,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> >
> >          acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
> >
> > -        mutex_lock(&acomp_ctx->mutex);
> > -
> >          dst = acomp_ctx->buffer;
> >          sg_init_table(&input, 1);
> >          sg_set_page(&input, page, PAGE_SIZE, 0);
> > @@ -949,7 +950,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> >          else if (alloc_ret)
> >                  zswap_reject_alloc_fail++;
> >
> > -        mutex_unlock(&acomp_ctx->mutex);
> >          return comp_ret == 0 && alloc_ret == 0;
> >  }
> >
> > @@ -986,10 +986,16 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> >          acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
> >          BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
> >          BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
> > -        mutex_unlock(&acomp_ctx->mutex);
> >
> >          if (src != acomp_ctx->buffer)
> >                  zpool_unmap_handle(zpool, entry->handle);
> > +
> > +        /*
> > +         * It is safer to unlock the mutex after the check for
> > +         * "src != acomp_ctx->buffer" so that the value of "src"
> > +         * does not change.
> > +         */
>
> This comment is unnecessary, we should only release the lock after we
> are done accessing protected fields.

> > +        mutex_unlock(&acomp_ctx->mutex);
> >  }
> >
> >  /*********************************
> > @@ -1487,6 +1493,7 @@ bool zswap_store(struct folio *folio)
> >  {
> >          long nr_pages = folio_nr_pages(folio);
> >          swp_entry_t swp = folio->swap;
> > +        struct crypto_acomp_ctx *acomp_ctx;
> >          struct obj_cgroup *objcg = NULL;
> >          struct mem_cgroup *memcg = NULL;
> >          struct zswap_pool *pool;
> > @@ -1526,6 +1533,9 @@ bool zswap_store(struct folio *folio)
> >                  mem_cgroup_put(memcg);
> >          }
> >
> > +        acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
> > +        mutex_lock(&acomp_ctx->mutex);
> > +
> >          for (index = 0; index < nr_pages; ++index) {
> >                  struct page *page = folio_page(folio, index);
> >                  ssize_t bytes;
> > @@ -1547,6 +1557,7 @@ bool zswap_store(struct folio *folio)
> >                  ret = true;
> >
> >  put_pool:
> > +        mutex_unlock(&acomp_ctx->mutex);
> >          zswap_pool_put(pool);
> >  put_objcg:
> >          obj_cgroup_put(objcg);
> > --
> > 2.27.0
> >