Re: ANON_LARGE_FOLIOS meeting follow-up & refined proposal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27.09.23 09:23, Ryan Roberts wrote:
On 26/09/2023 19:31, David Hildenbrand wrote:
On 25.09.23 10:51, Ryan Roberts wrote:
On 23/09/2023 01:33, John Hubbard wrote:
On 9/22/23 08:48, Ryan Roberts wrote:
...
I never had any feedback on the below; I'm not sure if that means everyone is
happy or that nobody read it??

One can never really know: zero or more people read it, and of those, no
one hated it enough to send out a quick NAK. So that's a *possible*,
lukewarm endorsement of sorts. Success! :)

You really know how to fill a guy with confidence! ;-)


...

BUT I've had yet another idea on the controls front, which would enable
exposing
this to user space as an extension to transparent_hugepage, while continuing to
support THP as is and also be able to control THP and ALF (anon large folio)

The new ALF / ANON_LARGE_FOLIO naming looks good to me. The grep aspect
is a nice touch.

Well if we go the route of the newest proposal, then I guess the naming is less
important, because it all attaches to transparent_hugepage.

I agree that ALF is better. But having something under "THP", that is not THP
and not accounted as THP ... I don't quite like it (although, before we
discussed that approach in the past, I did like it).

I know we discussed and concluded against putting it under THP in the past, but
I think that decision was driven by not not having any proposal that would allow
us to put it under THP without breaking the expectations of existing (PMD-sized)
THP users, or not being able to control use of the lower orders and PMD-order.

Not only that. It was also because we didn't want to confuse users/devs that assume that THP == PMD-sized.

I'll CC Hugh, I recall he had an opinion on that (I recall some comments about cleanly separating both features towards the user).

Personally I think my latest proposal is a way to solve that problem, and in
that case, I personally think exposing it as an extension to THP is neater:

  - all existing THP controls work as they did before
  - new anon_orders and anon_always_mask files allow opt-in to
    smaller-than-PMD-orders

As "enable" controls anon only (that's correct, right?), maybe these should also simply be called "orders" and "always_mask". shmem could get their own set, like "shmem_enable".

  - All exisitng counters remain unchanged, and continue to count PMD-mapped THP
    only:
       - /proc/meminfo:AnonHugePages
       - /sys/devices/system/node/nodeX/meminfo:AnonHugePages
       - /proc/vmstat:nr_anon_transparent_hugepages
       - /proc/<pid>/smaps[_roolup]:AnonHugePages
       - memory.stat(v1):rss_huge
       - memory.stat(v2):anon_thp
  - New counters introduced to count PTE-mapped THP/large folios:
       - /proc/meminfo:AnonHugePteMap
       - /sys/devices/system/node/nodeX/meminfo:AnonHugePteMap
       - /proc/vmstat:nr_anon_thp_pte
       - /proc/<pid>/smaps[_roolup]:AnonHugePteMap
       - memory.stat(v1):anon_thp_pte
       - memory.stat(v2):anon_thp_pte
  - It's a lot less code (I have an implementation for both approaches)

Admittedly, I haven't spent too much time thinking about the other thp counters
in vmstat yet (e.g. thp_fault_alloc, thp_fault_fallback, etc). Proposal is that
for now, they would continue to be PMD-order only. But I think you could
probably hook those upto the PTE-mapped ones as well, instead of duplicating all
the counters.

As Kiril mentioned, PTE-mapped THP is already a thing, so this approach just
formalises it.

Not quite. PTE-mapped THP were just a side-effect of the transparency handling. We never allocated and populated PTE-mapped PMD-sized THP on allocation. So I don't immediately see the connection between both for this case.

Would you account a PTE-mapped (PMD-sized) THP as anon_thp or anon_thp_pte? What if it's mapped via PTEs and PMDs? I don't see how that formalises that case for the existing PMD-szed THP.


I also think the "huge" means PMD-size argument is a bit weak, given that THP
supports PUD-size today for file mappings, and in the context of hugetlb, huge
can mean contpte, pmd, contpmd, pud, etc.

I made similar statements in the past but was convinced otherwise :)


I'll have the patch set ready to post by Friday. How about I post it, then we
can continue the conversation in the context of the actual code? If the
concensus is that this is not the way to do it, then I'll post the large_folio
version instead?

No strong opinion from my side, I considered a "fresh start" without the THP implication/thermonology after all the previous discussions cleaner [which I think was one of the outcomes of the previous discussions].

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux