Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 1 Mar 2023 00:08:17 +0000

On Tue, Feb 28, 2023 at 03:22:20PM -0800, Chris Li wrote:
> Hi Matthew,
> 
> On Sun, Feb 19, 2023 at 04:31:33AM +0000, Matthew Wilcox wrote:
> > 
> > I think an overhaul of the swap code is long overdue.  I appreciate
> > you're very much focused on zswap, but there are many other problems.
> > For example, swap does not work on zoned devices.  Swap readahead is
> > generally physical (ie optimised for spinning discs) rather than logical
> > (more appropriate for SSDs).  Swap's management of free space is crude
> > compared to real filesystems.  The way that swap bypasses the filesystem
> > when writing to swap files is awful.  I haven't even started to look at
> 
> Can you expand a bit on that? I assume you want to see the swap file
> behavior more like a normal file system and reuse more of the readpage()
> and writepage() path.

Actually, no.  readpage() and writepage() should be reserved for the
page cache.  We now have ->swap_rw(), but so far only NFS implements it.
Instead of constructing its own BIOs, swap should invoke ->swap_rw()
for every filesystem.  I suspect we can write a fairly generic
block_swap_rw() that covers the vast majority of filesystems.
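
To make the shape of that concrete, here is a toy userspace model, not
kernel code.  Every toy_* name is made up for illustration, and the
hook signature is simplified (the real ->swap_rw() takes a kiocb and
an iov_iter).  It only shows swap core calling through the filesystem
hook, with one shared helper doing the file-page to device-page
mapping that most block filesystems could plug in:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model; all types are simplified stand-ins. */
#define TOY_PAGE_SIZE 4096

struct toy_swap_file;

struct toy_aops {
	/* Stand-in for ->swap_rw(): one page in or out of the swap file. */
	int (*swap_rw)(struct toy_swap_file *sf, size_t page_idx,
		       void *buf, int write);
};

struct toy_swap_file {
	const struct toy_aops *aops;
	unsigned char *backing;		/* pretend block device */
	size_t nr_pages;		/* pages in the swap file */
	const size_t *extent_map;	/* file page -> device page */
};

/*
 * Hypothetical generic helper in the spirit of block_swap_rw():
 * the filesystem side translates the file offset and moves the data.
 */
int toy_block_swap_rw(struct toy_swap_file *sf, size_t page_idx,
		      void *buf, int write)
{
	unsigned char *dev;

	if (page_idx >= sf->nr_pages)
		return -1;

	dev = sf->backing + sf->extent_map[page_idx] * TOY_PAGE_SIZE;
	if (write)
		memcpy(dev, buf, TOY_PAGE_SIZE);
	else
		memcpy(buf, dev, TOY_PAGE_SIZE);
	return 0;
}

/* Swap core: no I/O construction here, just a call through the hook. */
int toy_swap_io(struct toy_swap_file *sf, size_t page_idx,
		void *buf, int write)
{
	if (!sf->aops || !sf->aops->swap_rw)
		return -1;	/* assumption: no hook means no swap on this file */
	return sf->aops->swap_rw(sf, page_idx, buf, write);
}
```

A filesystem with nothing special to do would set .swap_rw to
toy_block_swap_rw; one with its own needs (as NFS has) would supply
its own function, and toy_swap_io() would not need to know which.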

> > what changes need to be made to swap in order to swap out arbitrary-order
> > folios (instead of PMD-sized + PTE-sized).
> 
> When the page fault happens, does the whole folio get swapped in, or is
> it broken into smaller pages?

I think the whole folio should be swapped in.  See my proposal for
determining the correct size folio to use here:
https://lore.kernel.org/linux-mm/Y%2FU8bQd15aUO97vS@xxxxxxxxxxxxxxxxxxxx/

Assuming something like that gets implemented, for a large folio to
have been swapped out, there must have been a number of page faults on
the folio, followed by a period of no faults.  When a fault then
arrives out of the blue, I think we should bring the whole folio back
in.  The algorithm I outline in that email would then take care of
breaking the folio down into smaller folios if it turns out that parts
of it are not used.
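
A toy userspace model of that policy, for illustration only.  The
swap-in rule (one fault brings back the whole folio) is what I
describe above.  The split rule is NOT the algorithm from the linked
email; it is an assumed stand-in (split in half when only one half has
been touched) so that the example has something to run.  All toy_*
names are made up:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 64

struct toy_folio {
	size_t start;		/* first page index covered */
	size_t nr_pages;	/* power of two */
	bool resident;		/* false while swapped out */
};

/* Per-page "referenced" bits, set on fault. */
static bool toy_accessed[TOY_MAX_PAGES];

/*
 * Fault on @page: if the folio is swapped out, bring all of it back.
 * Returns the number of pages read from swap (0 if already resident).
 */
size_t toy_fault(struct toy_folio *f, size_t page)
{
	size_t nr_read = 0;

	assert(page >= f->start && page < f->start + f->nr_pages);
	if (!f->resident) {
		f->resident = true;
		nr_read = f->nr_pages;	/* whole folio, not just one page */
	}
	toy_accessed[page] = true;
	return nr_read;
}

static bool toy_range_used(size_t start, size_t n)
{
	for (size_t i = start; i < start + n; i++)
		if (toy_accessed[i])
			return true;
	return false;
}

/*
 * Assumed split rule (a stand-in, see above): if exactly one half of a
 * resident folio has been touched, split it into two half-sized folios.
 */
bool toy_maybe_split(const struct toy_folio *f,
		     struct toy_folio *lo, struct toy_folio *hi)
{
	size_t half = f->nr_pages / 2;

	if (!f->resident || half == 0)
		return false;
	if (toy_range_used(f->start, half) ==
	    toy_range_used(f->start + half, half))
		return false;	/* both halves used, or neither */

	*lo = (struct toy_folio){ f->start, half, true };
	*hi = (struct toy_folio){ f->start + half, half, true };
	return true;
}
```

With an 8-page folio swapped out, a single fault on page 5 reads all
8 pages back; if only the upper half is touched after that, a later
toy_maybe_split() yields two 4-page folios, and the untouched lower
one becomes a candidate for going back out on its own.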