Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 31.10.23 12:55, Ryan Roberts wrote:
On 31/10/2023 11:50, Ryan Roberts wrote:
On 06/10/2023 21:06, David Hildenbrand wrote:
[...]

Change 2: sysfs interface.

If we call it THP, it shall go under "/sys/kernel/mm/transparent_hugepage/", I
agree.

What we expose there and how, is TBD. Again, not a friend of "orders" and
bitmaps at all. We can do better if we want to go down that path.

Maybe we should take a look at hugetlb, and how they added support for multiple
sizes. What *might* make sense could be (depending on which values we actually
support!)


/sys/kernel/mm/transparent_hugepage/hugepages-64kB/
/sys/kernel/mm/transparent_hugepage/hugepages-128kB/
/sys/kernel/mm/transparent_hugepage/hugepages-256kB/
/sys/kernel/mm/transparent_hugepage/hugepages-512kB/
/sys/kernel/mm/transparent_hugepage/hugepages-1024kB/
/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/

Each one would contain an "enabled" and "defrag" file. We want something minimal
first? Start with the "enabled" option.


enabled: always [global] madvise never

Initially, we would set it for PMD-sized THP to "global" and for everything else
to "never".

Hi David,

I've just started coding this, and it occurs to me that I might need a small
clarification here; the existing global "enabled" control is used to drive
decisions for both anonymous memory and (non-shmem) file-backed memory. But the
proposed new per-size "enabled" is implicitly only controlling anon memory (for
now).

1) Is this potentially confusing for the user? Should we rename the per-size
controls to "anon_enabled"? Or is it preferable to jsut keep it vague for now so
we can reuse the same control for file-backed memory in future?

2) The global control will continue to drive the file-backed memory decision
(for now), even when hugepages-2048kB/enabled != "global"; agreed?

Thanks,
Ryan


Also, an implementation question:

hugepage_vma_check() doesn't currently care whether enabled="never" for DAX VMAs
(although it does honour MADV_NOHUGEPAGE and the prctl); It will return true
regardless. Is that by design? It couldn't fathom any reasoning from the commit log:

The whole DAX "hugepage" and THP mixup is just plain confusing. We're simply using PUD/PMD mappings of DAX memory, and PMD/PTE- remap when required (VMA split I assume, COW).

It doesn't result in any memory waste, so who really cares how it's mapped? Apparently we want individual processes to just disable PMD/PUD mappings of DAX using the prctl and madvise. Maybe there are good reasons.

Looks like a design decision, probably some legacy leftovers.

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux