RE: [PATCH v1] mm: zswap: Fix a potential memory leak in zswap_decompress().

"Sridhar, Kanchana P" <kanchana.p.sridhar@xxxxxxxxx> · Wed, 13 Nov 2024 22:13:32 +0000

> -----Original Message-----
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Sent: Wednesday, November 13, 2024 1:30 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx;
> usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying
> <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh
> <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
> zswap_decompress().
> 
> On Wed, Nov 13, 2024 at 07:12:18PM +0000, Sridhar, Kanchana P wrote:
> > I am still thinking moving the mutex_unlock() could help, or at least have
> > no downside. The acomp_ctx is per-cpu and its mutex_lock/unlock
> > safeguards the interaction between the decompress operation, the
> > sg_*() API calls inside zswap_decompress() and the shared zpool.
> >
> > If we release the per-cpu acomp_ctx's mutex lock before the
> > zpool_unmap_handle(), is it possible that another cpu could acquire
> > its acomp_ctx's lock and map the same zpool handle (that the earlier
> > cpu has yet to unmap or is concurrently unmapping) for a write?
> > If this could happen, would it result in undefined state for both
> > these zpool ops on different cpus?
> 
> The code is fine as is.
> 
> Like you said, acomp_ctx->buffer (the pointer) doesn't change. It
> points to whatever was kmalloced in zswap_cpu_comp_prepare(). The
> handle points to backend memory. Neither of those addresses can change
> under us. There is no confusing them, and they cannot coincide.
> 
> The mutex guards the *memory* behind the buffer, so that we don't have
> multiple (de)compressors stepping on each other's toes. But it's fine
> to drop the mutex once we're done working with the memory. We don't
> need the mutex to check whether src holds the acomp buffer address.

Thanks, Johannes, for these insights. I had been thinking of the
following code in zswap_decompress() as creating a non-preemptible
context because of the call to raw_cpu_ptr() at the start, with that
context extending until the mutex_unlock():

	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
	mutex_lock(&acomp_ctx->mutex);

	[...]

	mutex_unlock(&acomp_ctx->mutex);

	if (src != acomp_ctx->buffer)
		zpool_unmap_handle(zpool, entry->handle);

Based on this understanding, I was worried that the
"acomp_ctx->buffer" in the conditional that gates
zpool_unmap_handle() might not belong to the same acomp_ctx as the
one obtained at the beginning. I was confusing myself: the acomp_ctx
is not re-evaluated before the conditional, it is simply reused from
the start. My apologies to you and Yosry!
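
To convince myself, here is a small userspace sketch of the same
pattern. It is only an illustration, not the zswap code: struct
demo_ctx, demo_decompress() and the must_copy parameter are
hypothetical names, a pthread mutex stands in for the kernel mutex,
and memcpy() stands in for the actual decompression. The ctx pointer
is taken once by the caller and reused, so the check after the unlock
only compares two addresses and never reads the guarded memory:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the per-CPU acomp_ctx. */
struct demo_ctx {
	pthread_mutex_t mutex;	/* guards the memory behind buffer */
	char buffer[64];	/* scratch area; its address never changes */
};

/*
 * Copies len bytes to dst, optionally staging them through ctx->buffer.
 * Returns true if src still refers to the caller's "mapped" memory,
 * i.e. the caller would still have to unmap it.
 */
static bool demo_decompress(struct demo_ctx *ctx, const char *mapped,
			    size_t len, bool must_copy, char *dst)
{
	const char *src = mapped;

	assert(len <= sizeof(ctx->buffer));

	pthread_mutex_lock(&ctx->mutex);
	if (must_copy) {
		/* Stage the data in the scratch buffer, as in the copy path. */
		memcpy(ctx->buffer, mapped, len);
		src = ctx->buffer;
	}
	memcpy(dst, src, len);	/* stand-in for the decompress step */
	pthread_mutex_unlock(&ctx->mutex);

	/*
	 * Address comparison only: ctx is the same pointer we started
	 * with, and the buffer contents are not touched here.
	 */
	return src != ctx->buffer;
}
```

With must_copy false the function reports that the mapped memory is
still in use; with must_copy true it reports that src was switched to
the scratch buffer.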

> 
> That being said, I do think there is a UAF bug in CPU hotplugging.
> 
> There is an acomp_ctx for each cpu, but note that this is best effort
> parallelism, not a guarantee that we always have the context of the
> local CPU. Look closely: we pick the "local" CPU with preemption
> enabled, then contend for the mutex. This may well put us to sleep and
> get us migrated, so we could be using the context of a CPU we are no
> longer running on. This is fine because we hold the mutex - if that
> other CPU tries to use the acomp_ctx, it'll wait for us.
> 
> However, if we get migrated and vacate the CPU whose context we have
> locked, the CPU might get offlined and zswap_cpu_comp_dead() can free
> the context underneath us. I think we need to refcount the acomp_ctx.

I see. Wouldn't it then make the code more fail-safe to not allow the
migration to happen until after the check for (src != acomp_ctx->buffer),
by moving the mutex_unlock() after this check? Or to use a boolean to
record whether the unmap_handle needs to be done, as Yosry suggested?
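
For the boolean option, something along these lines is what I have in
mind. Again this is a userspace sketch under my own assumptions, not a
patch: struct flag_ctx, flag_unmap(), flag_decompress() and the
must_copy parameter are hypothetical, flag_unmap() merely counts calls
in place of zpool_unmap_handle(), and I am assuming the copy path
unmaps right after the copy. The decision is recorded in a local
variable while the mutex is held, so nothing in the context is looked
at after the unlock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the per-CPU acomp_ctx. */
struct flag_ctx {
	pthread_mutex_t mutex;
	char buffer[64];
};

/* Counts calls so the sketch can show one unmap per decompress. */
static int flag_unmap_calls;

static void flag_unmap(void)
{
	flag_unmap_calls++;	/* stand-in for unmapping the handle */
}

static void flag_decompress(struct flag_ctx *ctx, const char *mapped,
			    size_t len, bool must_copy, char *dst)
{
	const char *src = mapped;
	bool needs_unmap = true;	/* local, so valid after unlock */

	assert(len <= sizeof(ctx->buffer));

	pthread_mutex_lock(&ctx->mutex);
	if (must_copy) {
		memcpy(ctx->buffer, mapped, len);
		src = ctx->buffer;
		flag_unmap();		/* mapping no longer needed */
		needs_unmap = false;
	}
	memcpy(dst, src, len);		/* stand-in for the decompress step */
	pthread_mutex_unlock(&ctx->mutex);

	/* Only the local flag is consulted; ctx is not dereferenced. */
	if (needs_unmap)
		flag_unmap();
}
```

Either path ends with exactly one unmap per call, and the check after
the unlock no longer depends on acomp_ctx at all.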

Thanks,
Kanchana