Re: [LSF/MM/BPF TOPIC] Swap Abstraction "the pony"

On Fri, Mar 1, 2024 at 1:53 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> > At the swap entry level, here is the list of existing swap entry usage:
> >
> > * Swap entry allocation and free. Each swap entry needs to be
> > associated with a location of the disk space in the swapfile. (offset
> > of swap entry).
> > * Each swap entry needs to track the map count of the entry. (swap_map)
> > * Each swap entry needs to be able to find the associated memory
> > cgroup. (swap_cgroup_ctrl->map)
> > * Swap cache. Lookup folio/shadow from swap entry
> > * Swap page writes through a swapfile in a file system other than a
> > block device. (swap_extent)
> > * Shadow entry. (store in swap cache)
>
> IMHO, one thing this new abstraction should support is seamless
> transfer/migration of pages from one backend to another (perhaps from
> high to low priority backends, i.e. writeback).

Yes, that is the next step. I am just covering the existing usage here.
What you describe is what I call "the swap tiers". I considered that
topic but did not submit it this year. The current swap back end is
too entangled (for lack of a better word). It is very hard to add more
complex data structures in the existing swap back end. That is why I
want to untangle it a bit before attacking the next-level stuff.
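
To make the list of existing usages a bit more concrete, here is a toy
userspace model in Python. It is not kernel code and not a proposed
design. The names SwapEntryState, ToySwapFile, alloc, dup and put are
made up for illustration; only the comments mentioning swap_map and
swap_cgroup_ctrl->map refer to things named in the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SwapEntryState:
    """Hypothetical per-entry state, one field per usage in the list."""
    offset: int                      # location of the data in the swapfile
    map_count: int = 0               # the role swap_map plays
    memcg_id: Optional[int] = None   # the role swap_cgroup_ctrl->map plays
    cache: Optional[object] = None   # swap cache slot: a folio or a shadow


@dataclass
class ToySwapFile:
    """Hypothetical swapfile with a fixed number of slots."""
    nr_slots: int
    entries: Dict[int, SwapEntryState] = field(default_factory=dict)

    def alloc(self, memcg_id: int) -> SwapEntryState:
        """Allocate the first free offset and charge it to a cgroup."""
        for offset in range(self.nr_slots):
            if offset not in self.entries:
                entry = SwapEntryState(offset, map_count=1, memcg_id=memcg_id)
                self.entries[offset] = entry
                return entry
        raise MemoryError("no free swap slot")

    def dup(self, offset: int) -> None:
        """Another user now references this entry."""
        self.entries[offset].map_count += 1

    def put(self, offset: int) -> None:
        """Drop one reference; free the slot when the count hits zero."""
        entry = self.entries[offset]
        entry.map_count -= 1
        if entry.map_count == 0:
            del self.entries[offset]
```

In this toy all of the per-entry state hangs off one object keyed by the
offset. The model is only meant to show which pieces of state a swap
entry has to carry today.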

>
> I think this will require some careful redesigns. The closest thing we
> have right now is zswap -> backing swapfile. But it is currently
> handled in a rather peculiar manner - the underlying swap slot has
> already been reserved for the zswap entry. But there's a couple of
> problems with this:
>
> a) This is wasteful. We're essentially having the same piece of data
> occupying spaces in two levels in the hierarchies.

Can you elaborate? If you have a ghost swap file, zswap will not
store data in two swap devices.
The price to pay is that you need to allocate another swap slot on the
real backing swap file. That is the same as moving data from an SSD to a
hard disk: you need to allocate a new swap entry on the destination
device.

> b) How do we generalize to a multi-tier hierarchy?

If zswap runs on a ghost swap file, flushing from zswap to another
real swap file would be very similar to flushing from one SSD to
another. That is the more generalized case. Zswap sharing a swap slot
with the backing swapfile is a very special case.
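
A rough way to picture this, again as a toy Python model rather than
kernel code: every backend, including zswap on a ghost swap file, owns
its own slots, and moving a page down a tier means allocating a slot on
the destination, copying the data, and freeing the source slot. The
names Backend, store, load, free and migrate are hypothetical.

```python
from typing import Dict, Tuple


class Backend:
    """Hypothetical swap backend: zswap on a ghost swap file, SSD, or disk."""

    def __init__(self, name: str, nr_slots: int) -> None:
        self.name = name
        self.nr_slots = nr_slots
        self.slots: Dict[int, bytes] = {}

    def store(self, data: bytes) -> int:
        """Allocate the first free slot on this backend and fill it."""
        for slot in range(self.nr_slots):
            if slot not in self.slots:
                self.slots[slot] = data
                return slot
        raise MemoryError(f"{self.name}: no free slot")

    def load(self, slot: int) -> bytes:
        return self.slots[slot]

    def free(self, slot: int) -> None:
        del self.slots[slot]


def migrate(src: Backend, slot: int, dst: Backend) -> Tuple[Backend, int]:
    """Move one page from src to dst; the same code serves any pair of tiers."""
    new_slot = dst.store(src.load(slot))  # the price: a new slot on dst
    src.free(slot)                        # the data now lives in one place only
    return dst, new_slot


zswap = Backend("zswap-on-ghost-file", nr_slots=4)
ssd = Backend("ssd", nr_slots=4)
hdd = Backend("hdd", nr_slots=4)

slot = zswap.store(b"page contents")
where, slot = migrate(zswap, slot, ssd)   # zswap -> SSD
where, slot = migrate(ssd, slot, hdd)     # SSD -> hard disk, same path
```

Nothing in migrate() knows whether the source is zswap or an SSD, which
is the point of treating the shared-slot arrangement as the special case.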

> c) This is a bit too backend-specific. It'd be nice if we can make
> this as backend-agnostic as possible (if possible).

Totally agree, that is one of my motivations for the "swap.tiers" idea.

>
> Motivation: I'm currently working/thinking about decoupling zswap and
> swap, and this is one of the more challenging aspects (as I can't seem
> to find a precedent in the swap world for inter-swap backends pages
> migration), and especially with respect to concurrent loads (and
> swapcache interactions).

It will be very messy if you try that in the current swap back end.

Chris

>
> I don't have good answers/designs quite yet - just raising some
> questions/concerns :)
>




