Re: [PATCH v1] mm: zswap: Fix a potential memory leak in zswap_decompress().

Johannes Weiner <hannes@xxxxxxxxxxx> · Wed, 13 Nov 2024 16:30:07 -0500

On Wed, Nov 13, 2024 at 07:12:18PM +0000, Sridhar, Kanchana P wrote:
> I am still thinking moving the mutex_unlock() could help, or at least have
> no downside. The acomp_ctx is per-cpu and its mutex_lock/unlock
> safeguards the interaction between the decompress operation, the
> sg_*() API calls inside zswap_decompress() and the shared zpool.
> 
> If we release the per-cpu acomp_ctx's mutex lock before the
> zpool_unmap_handle(), is it possible that another cpu could acquire
> its acomp_ctx's lock and map the same zpool handle (that the earlier
> cpu has yet to unmap or is concurrently unmapping) for a write?
> If this could happen, would it result in undefined state for both
> these zpool ops on different cpus?

The code is fine as is.

Like you said, acomp_ctx->buffer (the pointer) doesn't change. It
points to whatever was kmalloced in zswap_cpu_comp_prepare(). The
handle points to backend memory. Neither of those addresses can change
under us. There is no confusing them, and they cannot coincide.

The mutex guards the *memory* behind the buffer, so that we don't have
multiple (de)compressors stepping on each other's toes. But it's fine
to drop the mutex once we're done working with the memory. We don't
need the mutex to check whether src holds the acomp buffer address.
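
To illustrate the point, here is a minimal userspace sketch, not the
kernel code. struct ctx_sketch, decompress_sketch() and the use_bounce
flag are hypothetical stand-ins, a pthread mutex plays the role of the
acomp_ctx mutex, and memcpy() stands in for the actual decompression.
The only thing done after the unlock is a comparison of two addresses;
nothing dereferences the buffer at that point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 64

/* Hypothetical stand-in for the per-CPU context, not the kernel struct. */
struct ctx_sketch {
	pthread_mutex_t mutex;
	char *buffer;		/* assigned once at setup, never reassigned */
};

/*
 * Copy len bytes of "backend" memory into dst, optionally via the
 * context's bounce buffer. Returns true if src still points at backend
 * memory, i.e. the caller would still have a handle to unmap.
 */
static bool decompress_sketch(struct ctx_sketch *ctx, const char *backend,
			      size_t len, char *dst, bool use_bounce)
{
	const char *src = backend;

	assert(len <= SKETCH_BUF_SIZE);

	pthread_mutex_lock(&ctx->mutex);
	if (use_bounce) {
		/* The mutex protects the bytes behind ctx->buffer. */
		memcpy(ctx->buffer, backend, len);
		src = ctx->buffer;
	}
	memcpy(dst, src, len);	/* stands in for the decompress step */
	pthread_mutex_unlock(&ctx->mutex);

	/*
	 * Only two addresses are compared here. ctx->buffer (the pointer)
	 * is never reassigned, so this needs no lock.
	 */
	return src != ctx->buffer;
}
```

In this sketch the return value tells the caller whether an unmap is
still pending, which mirrors the src check after the unlock.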

That being said, I do think there is a UAF bug in CPU hotplugging.

There is an acomp_ctx for each cpu, but note that this is best-effort
parallelism, not a guarantee that we always have the context of the
local CPU. Look closely: we pick the "local" CPU with preemption
enabled, then contend for the mutex. This may well put us to sleep and
get us migrated, so we could be using the context of a CPU we are no
longer running on. This is fine because we hold the mutex - if that
other CPU tries to use the acomp_ctx, it'll wait for us.

However, if we get migrated and vacate the CPU whose context we have
locked, the CPU might get offlined and zswap_cpu_comp_dead() can free
the context underneath us. I think we need to refcount the acomp_ctx.
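
A rough idea of what refcounting could look like, as a userspace sketch
with C11 atomics rather than a patch. All names here (ctx_ref_sketch,
ctx_alloc, ctx_tryget, ctx_put) are hypothetical; in the kernel this
would presumably use refcount_t or similar. The "online" state holds
one reference, each user takes another, and whoever drops the last one
frees the context. The sketch assumes the lookup of the context itself
is already safe; it does not show how that would be guaranteed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted context, not the kernel's acomp_ctx. */
struct ctx_ref_sketch {
	atomic_int refs;	/* starts at 1: held by the "CPU online" state */
	char *buffer;
};

static struct ctx_ref_sketch *ctx_alloc(size_t size)
{
	struct ctx_ref_sketch *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->buffer = malloc(size);
	if (!ctx->buffer) {
		free(ctx);
		return NULL;
	}
	atomic_init(&ctx->refs, 1);
	return ctx;
}

/* Take a reference unless the count has already dropped to zero. */
static bool ctx_tryget(struct ctx_ref_sketch *ctx)
{
	int old = atomic_load(&ctx->refs);

	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&ctx->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true if this call freed the context. */
static bool ctx_put(struct ctx_ref_sketch *ctx)
{
	int old = atomic_fetch_sub(&ctx->refs, 1);

	assert(old > 0);
	if (old == 1) {
		free(ctx->buffer);
		free(ctx);
		return true;
	}
	return false;
}
```

With this shape, the offline path would call ctx_put() instead of
freeing directly, so a migrated task that still holds a reference keeps
the context alive until its own ctx_put().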