[..]

> > @@ -1336,6 +1347,7 @@ static void swap_entry_free(struct swap_info_struct *p, swp_entry_t entry)
> >  	count = p->swap_map[offset];
> >  	VM_BUG_ON(count != SWAP_HAS_CACHE);
> >  	p->swap_map[offset] = 0;
> > +	clear_bit(offset, p->zeromap);
>
> Hmm so clear_bit() is done at the swap_entry_free() point. I wonder if
> we can have a problem, where:
>
> 1. The swap entry has its zeromap bit set, and is freed to the swap
> slot cache (free_swap_slot() in mm/swap_slots.c). For instance, it is
> reclaimed from the swap cache, and all the processes referring to it
> are terminated, which decrements the swap count to 0 (swap_free() ->
> __swap_entry_free() -> free_swap_slot())

I do not think this can happen before swap_entry_free() is called. Note
that when a swap entry is freed to the swap slot cache in
free_swap_slot(), it is added to cache->slots_ret, not cache->slots.
The former are swap entries cached to be freed later using
swap_entry_free().

> 2. The swap slot is then re-used in swap space allocation
> (add_to_swap()) - its zeromap bit is never cleared.
>
> 3. swap_writepage() writes that non-zero page to swap
>
> 4. swap_read_folio() checks the bitmap, sees that the zeromap bit for
> the entry is set, so populates a zero page for it.
>
> zswap in the past had to invalidate these leftover entries quite
> carefully. Chengming then moved the invalidation point to
> free_swap_slot(), massively simplifying the logic.

I think the main benefit of moving the invalidation point was avoiding
leaving the compressed page in zswap until swap_entry_free() is called,
which will happen only when the swap slot caches are drained.

> I wonder if we need to do the same here?
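
To make the slots vs. slots_ret distinction above concrete, here is a
rough, userspace-style sketch of the two arrays in the per-CPU swap
slot cache. It is purely illustrative and not the kernel code: the
struct is trimmed down, and helpers like swap_entry_free_stub(),
free_swap_slot_sketch() and get_swap_slot_sketch() are made-up
stand-ins for the real swap_entry_free(), free_swap_slot() and the
allocation path.

#include <stdbool.h>

#define SWAP_SLOTS_CACHE_SIZE 64

typedef struct { unsigned long val; } swp_entry_t;

struct swap_slots_cache {
	/* Allocation side: entries handed out to swap space allocation. */
	swp_entry_t slots[SWAP_SLOTS_CACHE_SIZE];
	int nr;
	/* Free side: entries parked here by free_swap_slot(); they are
	 * invisible to the allocator until the cache is drained. */
	swp_entry_t slots_ret[SWAP_SLOTS_CACHE_SIZE];
	int n_ret;
};

/* Stand-in for swap_entry_free(): with the patch above, this is where
 * p->swap_map[offset] = 0 and clear_bit(offset, p->zeromap) happen. */
static void swap_entry_free_stub(swp_entry_t entry)
{
	(void)entry;
}

/* Free path (cf. free_swap_slot()): the entry only goes to slots_ret. */
static void free_swap_slot_sketch(struct swap_slots_cache *cache,
				  swp_entry_t entry)
{
	if (cache->n_ret >= SWAP_SLOTS_CACHE_SIZE) {
		/* Drain: every parked entry goes through the
		 * swap_entry_free() path before its slot can be
		 * allocated again, so the zeromap bit is cleared
		 * before any reuse. */
		for (int i = 0; i < cache->n_ret; i++)
			swap_entry_free_stub(cache->slots_ret[i]);
		cache->n_ret = 0;
	}
	cache->slots_ret[cache->n_ret++] = entry;
}

/* Allocation path: only slots[] is consulted, never slots_ret[]. */
static bool get_swap_slot_sketch(struct swap_slots_cache *cache,
				 swp_entry_t *out)
{
	if (cache->nr == 0)
		return false;	/* real code would refill slots[] here */
	*out = cache->slots[--cache->nr];
	return true;
}

So an entry sitting in slots_ret cannot be handed out by step 2 of the
scenario above; it first has to pass through swap_entry_free(), which
(with this patch) clears its zeromap bit.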