> -----Original Message----- > From: Johannes Weiner <hannes@xxxxxxxxxxx> > Sent: Wednesday, November 13, 2024 1:30 PM > To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx> > Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; > linux-mm@xxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx; > usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying > <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; > Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh > <vinodh.gopal@xxxxxxxxx> > Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in > zswap_decompress(). > > On Wed, Nov 13, 2024 at 07:12:18PM +0000, Sridhar, Kanchana P wrote: > > I am still thinking moving the mutex_unlock() could help, or at least have > > no downside. The acomp_ctx is per-cpu and it's mutex_lock/unlock > > safeguards the interaction between the decompress operation, the > > sg_*() API calls inside zswap_decompress() and the shared zpool. > > > > If we release the per-cpu acomp_ctx's mutex lock before the > > zpool_unmap_handle(), is it possible that another cpu could acquire > > it's acomp_ctx's lock and map the same zpool handle (that the earlier > > cpu has yet to unmap or is concurrently unmapping) for a write? > > If this could happen, would it result in undefined state for both > > these zpool ops on different cpu's? > > The code is fine as is. > > Like you said, acomp_ctx->buffer (the pointer) doesn't change. It > points to whatever was kmalloced in zswap_cpu_comp_prepare(). The > handle points to backend memory. Neither of those addresses can change > under us. There is no confusing them, and they cannot coincide. > > The mutex guards the *memory* behind the buffer, so that we don't have > multiple (de)compressors stepping on each others' toes. But it's fine > to drop the mutex once we're done working with the memory. We don't > need the mutex to check whether src holds the acomp buffer address. Thanks Johannes, for these insights. I was thinking of the following in zswap_decompress() as creating a non-preemptible context because of the call to raw_cpu_ptr() at the start; with this context extending until the mutex_unlock(): acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx); mutex_lock(&acomp_ctx->mutex); [...] mutex_unlock(&acomp_ctx->mutex); if (src != acomp_ctx->buffer) zpool_unmap_handle(zpool, entry->handle); Based on this understanding, I was a bit worried about the "acomp_ctx->buffer" in the conditional that gates the zpool_unmap_handle() not being the same acomp_ctx as the one at the beginning. I may have been confusing myself, since the acomp_ctx is not re-evaluated before the conditional, just reused from the start. My apologies to you and Yosry! > > That being said, I do think there is a UAF bug in CPU hotplugging. > > There is an acomp_ctx for each cpu, but note that this is best effort > parallelism, not a guarantee that we always have the context of the > local CPU. Look closely: we pick the "local" CPU with preemption > enabled, then contend for the mutex. This may well put us to sleep and > get us migrated, so we could be using the context of a CPU we are no > longer running on. This is fine because we hold the mutex - if that > other CPU tries to use the acomp_ctx, it'll wait for us. > > However, if we get migrated and vacate the CPU whose context we have > locked, the CPU might get offlined and zswap_cpu_comp_dead() can free > the context underneath us. I think we need to refcount the acomp_ctx. I see. Wouldn't it then seem to make the code more fail-safe to not allow the migration to happen until after the check for (src != acomp_ctx->buffer), by moving the mutex_unlock() after this check? Or, use a boolean to determine if the unmap_handle needs to be done as Yosry suggested? Thanks, Kanchana