On 26/09/2023 19:31, David Hildenbrand wrote: > On 25.09.23 10:51, Ryan Roberts wrote: >> On 23/09/2023 01:33, John Hubbard wrote: >>> On 9/22/23 08:48, Ryan Roberts wrote: >>> ... >>>> I never had any feedback on the below; I'm not sure if that means everyone is >>>> happy or that nobody read it?? >>> >>> One can never really know: zero or more people read it, and of those, no >>> one hated it enough to send out a quick NAK. So that's a *possible*, >>> lukewarm endorsement of sorts. Success! :) >> >> You really know how to fill a guy with confidence! ;-) >> >>> >>> ... >>> >>>> BUT I've had yet another idea on the controls front, which would enable >>>> exposing >>>> this to user space as an extension to transparent_hugepage, while continuing to >>>> support THP as is and also be able to control THP and ALF (anon large folio) >>> >>> The new ALF / ANON_LARGE_FOLIO naming looks good to me. The grep aspect >>> is a nice touch. >> >> Well if we go the route of the newest proposal, then I guess the naming is less >> important, because it all attaches to transparent_hugepage. > > I agree that ALF is better. But having something under "THP", that is not THP > and not accounted as THP ... I don't quite like it (although, before we > discussed that approach in the past, I did like it). I know we discussed and concluded against putting it under THP in the past, but I think that decision was driven by not not having any proposal that would allow us to put it under THP without breaking the expectations of existing (PMD-sized) THP users, or not being able to control use of the lower orders and PMD-order. Personally I think my latest proposal is a way to solve that problem, and in that case, I personally think exposing it as an extension to THP is neater: - all existing THP controls work as they did before - new anon_orders and anon_always_mask files allow opt-in to smaller-than-PMD-orders - All exisitng counters remain unchanged, and continue to count PMD-mapped THP only: - /proc/meminfo:AnonHugePages - /sys/devices/system/node/nodeX/meminfo:AnonHugePages - /proc/vmstat:nr_anon_transparent_hugepages - /proc/<pid>/smaps[_roolup]:AnonHugePages - memory.stat(v1):rss_huge - memory.stat(v2):anon_thp - New counters introduced to count PTE-mapped THP/large folios: - /proc/meminfo:AnonHugePteMap - /sys/devices/system/node/nodeX/meminfo:AnonHugePteMap - /proc/vmstat:nr_anon_thp_pte - /proc/<pid>/smaps[_roolup]:AnonHugePteMap - memory.stat(v1):anon_thp_pte - memory.stat(v2):anon_thp_pte - It's a lot less code (I have an implementation for both approaches) Admittedly, I haven't spent too much time thinking about the other thp counters in vmstat yet (e.g. thp_fault_alloc, thp_fault_fallback, etc). Proposal is that for now, they would continue to be PMD-order only. But I think you could probably hook those upto the PTE-mapped ones as well, instead of duplicating all the counters. As Kiril mentioned, PTE-mapped THP is already a thing, so this approach just formalises it. I also think the "huge" means PMD-size argument is a bit weak, given that THP supports PUD-size today for file mappings, and in the context of hugetlb, huge can mean contpte, pmd, contpmd, pud, etc. I'll have the patch set ready to post by Friday. How about I post it, then we can continue the conversation in the context of the actual code? If the concensus is that this is not the way to do it, then I'll post the large_folio version instead? Thanks, Ryan