Re: [LSF/MM/BPF TOPIC] Large folios, swap and fscache

Hi Andreas,

On Thu, Feb 22, 2024 at 7:03 PM Andreas Dilger <adilger@xxxxxxxxx> wrote:
>
> On Feb 22, 2024, at 3:45 PM, Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > Hi David,
> >
> > On Fri, Feb 2, 2024 at 1:10 AM David Howells <dhowells@xxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> The topic came up in a recent discussion about how to deal with large folios
> >> when it comes to swap as a swap device is normally considered a simple array
> >> of PAGE_SIZE-sized elements that can be indexed by a single integer.
> >
> > Sorry for being late to the party. I think I was the one who brought
> > this topic up in the online discussion with Will and you. Let me know
> > if you are referring to a different discussion.
> >
> >>
> >> With the advent of large folios, however, we might need to change this in
> >> order to be better able to swap out a compound page efficiently.  Swap
> >> fragmentation raises its head, as does the need to potentially save multiple
> >> indices per folio.  Does swap need to grow more filesystem features?
> >
> > Yes, with a large folio, it is harder to allocate contiguous swap
> > entries when 4K swap entries are allocated and freed all the time.
> > The fragmentation will likely leave the swap file with very few
> > contiguous swap entries.
>
> One option would be to reuse the multi-block allocator (mballoc) from
> ext4, which has quite efficient power-of-two buddy allocation.  That
> would naturally aggregate contiguous pages as they are freed.  Since
> the swap partition does not contain anything useful across a remount
> there is no need to save allocation bitmaps persistently.

That is a very interesting idea. I see two ways to solve this problem,
and a buddy allocator is one of them. A buddy allocator preserves the
assumption that the swap entries of a folio are contiguous. It still
has limits due to external fragmentation, though. For one, there is no
easy way to relocate a swap entry: we don't have an rmap for swap
entries, which makes them hard to compact. Even so, I expect a buddy
allocator to reduce fragmentation greatly.
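
To make the trade-off concrete, here is a rough user-space sketch in
Python of a power-of-two buddy allocator over swap slot indices. It is
only a toy model of the idea, not ext4's mballoc and not kernel code;
the class and method names (BuddySwapAllocator, alloc, free) are made
up for illustration.

```python
class BuddySwapAllocator:
    """Toy power-of-two buddy allocator over swap slot indices.

    Hypothetical illustration only: this is not mballoc and not the
    kernel's swap allocator.
    """

    def __init__(self, max_order):
        # The device is modelled as 2**max_order slots, initially one free block.
        self.max_order = max_order
        # free_lists[k] holds the first slot of each free block of 2**k slots.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)

    def alloc(self, order):
        """Return the first slot of an aligned run of 2**order slots, or None."""
        k = order
        # Find the smallest free block that is large enough.
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            return None
        start = min(self.free_lists[k])
        self.free_lists[k].remove(start)
        # Split the block down to the requested order, freeing the upper halves.
        while k > order:
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        return start

    def free(self, start, order):
        """Free a run of 2**order slots and merge it with free buddies."""
        k = order
        while k < self.max_order:
            buddy = start ^ (1 << k)  # the buddy differs only in bit k
            if buddy not in self.free_lists[k]:
                break
            # Both halves are free: merge into one block of the next order.
            self.free_lists[k].remove(buddy)
            start = min(start, buddy)
            k += 1
        self.free_lists[k].add(start)
```

With max_order=4 (16 slots), allocating 16 order-0 entries and then
freeing only the even-numbered ones leaves 8 free slots, yet alloc(1)
still fails because no two free slots are buddies. That is the
external fragmentation limit. Once the odd slots are freed as well,
everything merges back into a single order-4 block.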

The other way is to add a level of indirection that maps a folio's
swap entry to discontiguous swap entries. That breaks more of the
current code's assumptions about contiguous swap entries.
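
As a rough sketch of what I mean by indirection (again a Python toy
model, not a proposed kernel data structure; SwapSlotPool,
FolioSwapMap and their methods are hypothetical names), each
swapped-out folio keeps a small table from subpage index to whatever
slot happened to be free:

```python
class SwapSlotPool:
    """Toy pool of single swap slots with no contiguity guarantee."""

    def __init__(self, nr_slots):
        self.free = set(range(nr_slots))

    def alloc_any(self, count):
        """Return `count` free slots (lowest first), or None if too few remain."""
        if count > len(self.free):
            return None
        slots = sorted(self.free)[:count]
        self.free.difference_update(slots)
        return slots

    def release(self, slots):
        self.free.update(slots)


class FolioSwapMap:
    """Per-folio indirection: subpage index -> possibly scattered swap slot."""

    def __init__(self, pool):
        self.pool = pool
        self.table = {}  # folio id -> list of slots, position i is subpage i

    def swap_out(self, folio_id, nr_pages):
        """Reserve one slot per subpage; succeeds even if the slots are scattered."""
        slots = self.pool.alloc_any(nr_pages)
        if slots is None:
            return False
        self.table[folio_id] = slots
        return True

    def lookup(self, folio_id, page_idx):
        """Translate (folio, subpage) to the swap slot that backs it."""
        return self.table[folio_id][page_idx]

    def swap_in(self, folio_id):
        """Drop the mapping and return the folio's slots to the pool."""
        self.pool.release(self.table.pop(folio_id))
```

In this model a 4-page folio can be swapped out whenever any 4 slots
are free, for example slots 1, 4, 6 and 7. The price is that every
access goes through lookup() rather than computing base + index.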

If we can reuse the ext4 mballoc for swap entries, that would be
great. I will take a look at that and report back.

Thanks for the great suggestion.

Chris




