Re: [LSF/MM/BPF TOPIC] Integrate Swap Cache, Swap Maps with Swap Allocator

Johannes Weiner <hannes@xxxxxxxxxxx> · Tue, 4 Feb 2025 14:09:04 -0500

On Wed, Feb 05, 2025 at 02:38:39AM +0800, Kairui Song wrote:
> On Wed, Feb 5, 2025 at 2:11 AM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote:
> > However, what we should *not* do is have these clusters be tied to the
> > disk swap space with the ability to redirect some entries to use
> > something like zswap. This does not fix the problem Johannes is
> > describing.
> 
> Yes, a virtual swap file can have its own swap space, which is indexed
> by the cache / table, and reuse all the logic. As long as we don't
> dramatically change the kernel swapout path, adding a folio to the
> swapcache seems a very reasonable way to avoid redundant IO,
> synchronize swapin/swapout, and reuse a lot of infrastructure, even
> if that's a virtual file. For example, a current busy loop issue can
> be fixed just by leveraging the folio lock:
> https://lore.kernel.org/lkml/CAMgjq7D5qoFEK9Omvd5_Zqs6M+TEoG03+2i_mhuP5CQPSOPrmQ@xxxxxxxxxxxxxx/
> 
> The virtual file/space can be decoupled from the lower device. But the
> virtual file/space's table entry can point to an underlying physical
> SWAP device or some meta struct.

It's a bit unclear to me still which level will use the struct
swap_cluster_info in the layered scenario.

Would it be the virtual address space, where ->table has tagged
pointers to resolve to swapcache/zeromap/zswap/swapfile?

Or would it be the swapfile space, where ->table resolves to disk
slots?

Or are you proposing to use the same struct on both levels, with
->table catering to different needs?
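
To make the first option concrete, here is a rough userspace sketch of
what a tagged ->table entry in the virtual space might look like. This
is not a proposal for the actual layout. Every name in it
(vswap_entry_t, the VSWAP_* tags, the helpers) is hypothetical, and it
assumes backend pointers are at least 4-byte aligned so the low two
bits are free for the tag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical backend tags kept in the low two bits of a table entry. */
enum vswap_tag {
	VSWAP_SWAPCACHE = 0,	/* payload: pointer to an in-memory folio */
	VSWAP_ZEROMAP   = 1,	/* payload unused: page is all zeroes */
	VSWAP_ZSWAP     = 2,	/* payload: pointer to a compressed entry */
	VSWAP_SWAPFILE  = 3,	/* payload: slot offset on a swap device */
};

#define VSWAP_TAG_BITS	2
#define VSWAP_TAG_MASK	((uintptr_t)((1u << VSWAP_TAG_BITS) - 1))

/* One word per virtual swap slot (illustrative type, not a kernel one). */
typedef uintptr_t vswap_entry_t;

/* Encode a pointer payload; the pointer must leave the tag bits clear. */
static inline vswap_entry_t vswap_mk_ptr(enum vswap_tag tag, void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;

	assert((p & VSWAP_TAG_MASK) == 0);
	return p | (uintptr_t)tag;
}

/* Encode a disk slot offset; the offset is shifted above the tag bits. */
static inline vswap_entry_t vswap_mk_slot(unsigned long slot)
{
	return ((uintptr_t)slot << VSWAP_TAG_BITS) | VSWAP_SWAPFILE;
}

static inline enum vswap_tag vswap_tag(vswap_entry_t e)
{
	return (enum vswap_tag)(e & VSWAP_TAG_MASK);
}

static inline void *vswap_ptr(vswap_entry_t e)
{
	return (void *)(e & ~VSWAP_TAG_MASK);
}

static inline unsigned long vswap_slot(vswap_entry_t e)
{
	return (unsigned long)(e >> VSWAP_TAG_BITS);
}
```

A lookup would switch on vswap_tag() and then use either vswap_ptr()
or vswap_slot(), depending on the backend. Whether the real table can
spare tag bits like this is exactly the open question.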

Keep in mind, in the virtualized case, it's the top layer that would
have to keep track of the page table count, the swapcache pointer and
likely the memcg linkage. That also means the physical layer could
likely be reduced to a single bit per entry - used or free.

I suppose void *table could also point to such a bitmap? But not sure
about the other members that would become redundant/unused.
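
As a rough illustration of that split, assuming a hypothetical
512-slot cluster: the per-entry state moves into a top-layer record,
and the physical layer shrinks to a used/free bitmap. The struct and
function names below are invented for this sketch and are not existing
kernel definitions. The code is plain userspace C with no locking or
atomics.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS_PER_CLUSTER	512	/* hypothetical cluster size */
#define BITS_PER_WORD		(sizeof(unsigned long) * CHAR_BIT)
#define CLUSTER_WORDS \
	((SLOTS_PER_CLUSTER + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Top (virtual) layer: everything that is tracked per swapped-out page. */
struct virt_slot {
	void *swapcache;		/* stand-in for the folio pointer */
	unsigned short memcg_id;	/* stand-in for the memcg linkage */
	unsigned char map_count;	/* page table references */
};

/* Physical layer: one bit per disk slot, set means used. */
struct phys_cluster {
	unsigned long used[CLUSTER_WORDS];
};

static inline bool phys_slot_used(const struct phys_cluster *c, size_t slot)
{
	assert(slot < SLOTS_PER_CLUSTER);
	return (c->used[slot / BITS_PER_WORD] >> (slot % BITS_PER_WORD)) & 1UL;
}

/* Claim the lowest free slot; returns its index, or -1 if none is free. */
static inline long phys_slot_alloc(struct phys_cluster *c)
{
	for (size_t slot = 0; slot < SLOTS_PER_CLUSTER; slot++) {
		if (!phys_slot_used(c, slot)) {
			c->used[slot / BITS_PER_WORD] |=
				1UL << (slot % BITS_PER_WORD);
			return (long)slot;
		}
	}
	return -1;
}

/* Release a slot that is currently marked used. */
static inline void phys_slot_free(struct phys_cluster *c, size_t slot)
{
	assert(phys_slot_used(c, slot));
	c->used[slot / BITS_PER_WORD] &= ~(1UL << (slot % BITS_PER_WORD));
}
```

With this shape, the bitmap is the only state the physical cluster
needs, which is why any other per-cluster members would look redundant
on that level.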