Re: Prerequisites for Large Anon Folios

On 30.08.23 12:44, Ryan Roberts wrote:
Hi All,


Hi Ryan,

I'll be back from vacation next Wednesday.

Note that I asked David R. to have large anon folios as a topic for the next bi-weekly mm meeting.

There, we should discuss things like
* naming
* accounting (/proc/meminfo)
* required toggles (especially ways to disable it, as we want to
  keep toggles minimal)

David R. raised that there are certainly workloads where the additional memory overhead is usually not acceptable. So it will be valuable to get input from others.


I want to get serious about getting large anon folios merged. To do that, there
are a number of outstanding prerequisites. I'm hoping the respective owners may
be able to provide an update on progress?

I shared some details in the last meeting when you were on vacation :)

High level update below.

[...]


- item:
     shared vs exclusive mappings

   priority:
     prerequisite

   description: >-
     New mechanism to allow us to easily determine precisely whether a given
     folio is mapped exclusively or shared between multiple processes. Required
     for (from David H):

     (1) Detecting shared folios, to not mess with them while they are shared.
     MADV_PAGEOUT, user-triggered page migration, NUMA hinting, khugepaged ...
     replace cases where folio_estimated_sharers() == 1 would currently be the
     best we can do (and in some cases, page_mapcount() == 1).

     (2) COW improvements for PTE-mapped large anon folios after fork(). Before
     fork(), PageAnonExclusive would have been reliable, after fork() it's not.

     For (1), "MADV_PAGEOUT" maps to the "madvise" item captured in this list. I
     *think* "NUMA hinting" maps to "numa balancing" (but need confirmation!).
     "user-triggered page migration" and "khugepaged" not yet captured (would
     appreciate someone fleshing it out). I previously understood migration to be
     working for large folios - is "user-triggered page migration" some specific
     aspect that does not work?

     For (2), this relates to Large Anon Folio enhancements which I plan to
     tackle after we get the basic series merged.

   links:
     - 'email thread: Mapcount games: "exclusive mapped" vs. "mapped shared"'

   location:
     - shrink_folio_list()

   assignee:
     David Hildenbrand <david@xxxxxxxxxx>
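
To illustrate why folio_estimated_sharers() == 1 is only "the best we can do" for point (1), here is a minimal user-space sketch, not kernel code. It assumes the v6.5 behaviour in which the estimate looks only at the mapcount of the first subpage. Everything prefixed toy_ is hypothetical, and the per-process bitmask exists purely to make the toy checkable; it is not how the kernel tracks mappings and not the mechanism proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES  4   /* hypothetical number of subpages in the folio */
#define TOY_MAX_PROCS 8   /* hypothetical number of processes in the model */

/* Toy model only: per subpage, a bitmask of the processes mapping it via PTEs. */
struct toy_folio {
	unsigned int mappers[TOY_NR_PAGES];
};

/* Count the set bits in v. */
static int toy_popcount(unsigned int v)
{
	int n = 0;

	while (v) {
		n += v & 1u;
		v >>= 1;
	}
	return n;
}

/* Record that process 'pid' maps subpage 'idx'. */
static void toy_map(struct toy_folio *f, size_t idx, unsigned int pid)
{
	assert(idx < TOY_NR_PAGES && pid < TOY_MAX_PROCS);
	f->mappers[idx] |= 1u << pid;
}

/* Estimate: only the mapcount of the first subpage is consulted. */
static int toy_estimated_sharers(const struct toy_folio *f)
{
	return toy_popcount(f->mappers[0]);
}

/* Precise answer: is the whole folio mapped by exactly one process? */
static bool toy_mapped_exclusively(const struct toy_folio *f)
{
	unsigned int all = 0;

	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		all |= f->mappers[i];
	return toy_popcount(all) == 1;
}
```

If process 1 maps subpages 0 and 1 while process 2 (say, a child after fork() that unmapped part of the range) still maps subpage 3, the estimate reports one sharer although the folio is shared. A precise shared-vs-exclusive answer would close that gap.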

Any comment on this David? I think the last comment I saw was that you were
planning to start an implementation a couple of weeks back? Did that get anywhere?

The math should be solid at this point and I had a simple prototype running -- including fairly clean COW reuse handling.

I started cleaning it all up before my vacation. I'll first need the total mapcount (which I sent), and might have to implement rmap patching during THP split (easy), but I first have to do more measurements.

Willy's patches to free up space in the first tail page will be required, as will my patches to free up ->private in tail pages for THP_SWAP. Both are on their way upstream.

Based on that, I need a bit spinlock to protect the total mapcount+tracking data. There are things to measure (contention) and optimize (why even care about tracking shared vs. exclusive if it's pretty guaranteed to always be shared -- for example, shared libraries).
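
For readers unfamiliar with the idea, a bit spinlock uses a single bit of an existing word as the lock, which matters when space in struct page / the tail pages is scarce. The following is a minimal user-space sketch with C11 atomics, not the kernel's bit_spin_lock() and not the actual prototype. The struct layout, the toy_ names, and the opaque tracking_data field are all assumptions made for illustration; the thread does not describe what the tracking data looks like.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_LOCK_BIT 1u   /* hypothetical: bit 0 of 'flags' serves as the lock */

/* Hypothetical layout: a lock bit guarding the total mapcount and tracking data. */
struct toy_tracking {
	atomic_uint flags;
	int total_mapcount;
	uint64_t tracking_data;   /* opaque placeholder, contents not modelled */
};

/* Spin until this caller is the one that flips the lock bit from 0 to 1. */
static void toy_lock(struct toy_tracking *t)
{
	while (atomic_fetch_or_explicit(&t->flags, TOY_LOCK_BIT,
					memory_order_acquire) & TOY_LOCK_BIT)
		;   /* busy-wait; contention here is what would need measuring */
}

/* Clear the lock bit, publishing the protected updates. */
static void toy_unlock(struct toy_tracking *t)
{
	unsigned int old = atomic_fetch_and_explicit(&t->flags, ~TOY_LOCK_BIT,
						     memory_order_release);

	assert(old & TOY_LOCK_BIT);   /* unlocking an unlocked word is a bug */
	(void)old;
}

/* Update the mapcount and tracking data together under the lock. */
static int toy_update(struct toy_tracking *t, int delta, uint64_t new_data)
{
	int total;

	toy_lock(t);
	t->total_mapcount += delta;
	t->tracking_data = new_data;
	total = t->total_mapcount;
	toy_unlock(t);
	return total;
}
```

Every map and unmap operation on the folio would go through something like toy_update(), so all mappers of a heavily shared folio serialize on one bit. That is the contention concern raised above, and why skipping the tracking for folios that are practically always shared looks attractive.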

So it looks reasonable at this point, but I'll have to look into possible contentions and optimizations once I have the basics implemented cleanly.

It's a shame we cannot get the subpage mapcount out of the way immediately, then it wouldn't be "additional tracking" but "different tracking" :)

Once back from vacation, I'm planning on prioritizing this. Shouldn't take ages to get it cleaned up. Measurements and optimizations might take a bit longer.

[...]



   assignee:
     Yin, Fengwei <fengwei.yin@xxxxxxxxx>

As I understand it: initial solution based on folio_estimated_sharers() has gone
into v6.5. Have a dependency on David's precise shared vs exclusive work for an
improved solution.

Having shared vs. exclusive in place would replace folio_estimated_sharers() users and most sub-page mapcount users.

And I think you mentioned you are planning to do a change
that avoids splitting a large folio if it is entirely covered by the range?

[...]

- item:
     numa balancing

   priority:
     prerequisite

   description: >-
     Large, pte-mapped folios are ignored by numa-balancing code. Commit comment
     (e81c480): "We're going to have THP mapped with PTEs. It will confuse
     numabalancing. Let's skip them for now." Likely depends on "shared vs
     exclusive mappings".

   links: []

   location:
     - do_numa_page()

   assignee:
     <none>


Vaguely sounded like David might be planning to tackle this as part of his work
on "shared vs exclusive mappings" ("NUMA hinting"??). David?

It should be easy to handle it based on that. Similarly, khugepaged IIRC.

--
Cheers,

David / dhildenb




