On Tue, Feb 06, 2024 at 10:23:33AM +0800, Chengming Zhou wrote: > On 2024/2/6 06:55, Yosry Ahmed wrote: > > On Sun, Feb 04, 2024 at 08:34:11AM +0000, chengming.zhou@xxxxxxxxx wrote: > >> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> > >> > >> We may encounter duplicate entry in the zswap_store(): > >> > >> 1. swap slot that freed to per-cpu swap cache, doesn't invalidate > >> the zswap entry, then got reused. This has been fixed. > >> > >> 2. !exclusive load mode, swapin folio will leave its zswap entry > >> on the tree, then swapout again. This has been removed. > >> > >> 3. one folio can be dirtied again after zswap_store(), so need to > >> zswap_store() again. This should be handled correctly. > >> > >> So we must invalidate the old duplicate entry before insert the > >> new one, which actually doesn't have to be done at the beginning > >> of zswap_store(). And this is a normal situation, we shouldn't > >> WARN_ON(1) in this case, so delete it. (The WARN_ON(1) seems want > >> to detect swap entry UAF problem? But not very necessary here.) > >> > >> The good point is that we don't need to lock tree twice in the > >> store success path. > >> > >> Note we still need to invalidate the old duplicate entry in the > >> store failure path, otherwise the new data in swapfile could be > >> overwrite by the old data in zswap pool when lru writeback. > > > > I think this may have been introduced by 42c06a0e8ebe ("mm: kill > > frontswap"). Frontswap used to check if the page was present in > > frontswap and invalidate it before calling into zswap, so it would > > invalidate a previously stored page when it is dirtied and swapped out > > again, even if zswap is disabled. > > > > Johannes, does this sound correct to you? If yes, I think we need a > > proper Fixes tag and a stable backport as this may cause data > > corruption. > > I haven't looked into that commit. If this is true, will add: > > Fixes: 42c06a0e8ebe ("mm: kill frontswap") You're right, this was introduced by the frontswap removal. The Fixes tag is appropriate, as well as CC: stable@xxxxxxxxxxxxxxx. > >> We have to do this even when !zswap_enabled since zswap can be > >> disabled anytime. If the folio store success before, then got > >> dirtied again but zswap disabled, we won't invalidate the old > >> duplicate entry in the zswap_store(). So later lru writeback > >> may overwrite the new data in swapfile. > >> > >> This fix is not good, since we have to grab lock to check everytime > >> even when zswap is disabled, but it's simple. > > > > Frontswap had a bitmap that we can query locklessly to find out if there > > is an outdated stored page. I think we can overcome this with the > > xarray, we can do a lockless lookup first, and only take the lock if > > there is an outdated entry to remove. > > Yes, agree! We can lockless lookup once xarray lands in. > > > Meanwhile I am not sure if acquiring the lock on every swapout even with > > zswap disabled is acceptable, but I think it's the simplest fix for now, > > unless we revive the bitmap. > > Yeah, it's simple. I think bitmap is not needed if we will use xarray. I don't think the lock is a dealbreaker in the short term. We also take it in the load and invalidate paths even if zswap is disabled, to maintain coherency during intermittent enabling/disabling. It hasn't been an issue in production at least. > >> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> > > > > For now, I think an if condition is clearer: > > > > if (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) { > > zswap_invalidate_entry(tree, dupentry); > > /* Must succeed, we just removed the dup under the lock */ > > WARN_ON(zswap_rb_insert(&tree->rbroot, entry, &dupentry)); > > } > > This is clearer, will change to this version. Agreed! With that: Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>