On 01/07/2024 19:20, Yang Shi wrote: > On Mon, Jul 1, 2024 at 3:23 AM David Hildenbrand <david@xxxxxxxxxx> wrote: >> >> On 01.07.24 12:16, Ryan Roberts wrote: >>> On 01/07/2024 10:17, David Hildenbrand wrote: >>>> On 01.07.24 11:14, Ryan Roberts wrote: >>>>> On 01/07/2024 09:57, David Hildenbrand wrote: >>>>>> On 01.07.24 10:50, Ryan Roberts wrote: >>>>>>> On 01/07/2024 09:48, David Hildenbrand wrote: >>>>>>>> On 01.07.24 10:40, Ryan Roberts wrote: >>>>>>>>> On 01/07/2024 09:33, Baolin Wang wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2024/7/1 15:55, Ryan Roberts wrote: >>>>>>>>>>> On 28/06/2024 11:49, Bang Li wrote: >>>>>>>>>>>> After the commit 7fb1b252afb5 ("mm: shmem: add mTHP support for >>>>>>>>>>>> anonymous shmem"), we can configure different policies through >>>>>>>>>>>> the multi-size THP sysfs interface for anonymous shmem. But >>>>>>>>>>>> currently "THPeligible" indicates only whether the mapping is >>>>>>>>>>>> eligible for allocating THP-pages as well as the THP is PMD >>>>>>>>>>>> mappable or not for anonymous shmem, we need to support semantics >>>>>>>>>>>> for mTHP with anonymous shmem similar to those for mTHP with >>>>>>>>>>>> anonymous memory. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Bang Li <libang.li@xxxxxxxxxxxx> >>>>>>>>>>>> --- >>>>>>>>>>>> fs/proc/task_mmu.c | 10 +++++++--- >>>>>>>>>>>> include/linux/huge_mm.h | 11 +++++++++++ >>>>>>>>>>>> mm/shmem.c | 9 +-------- >>>>>>>>>>>> 3 files changed, 19 insertions(+), 11 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c >>>>>>>>>>>> index 93fb2c61b154..09b5db356886 100644 >>>>>>>>>>>> --- a/fs/proc/task_mmu.c >>>>>>>>>>>> +++ b/fs/proc/task_mmu.c >>>>>>>>>>>> @@ -870,6 +870,7 @@ static int show_smap(struct seq_file *m, void *v) >>>>>>>>>>>> { >>>>>>>>>>>> struct vm_area_struct *vma = v; >>>>>>>>>>>> struct mem_size_stats mss = {}; >>>>>>>>>>>> + bool thp_eligible; >>>>>>>>>>>> smap_gather_stats(vma, &mss, 0); >>>>>>>>>>>> @@ -882,9 +883,12 @@ static int show_smap(struct seq_file *m, void >>>>>>>>>>>> *v) >>>>>>>>>>>> __show_smap(m, &mss, false); >>>>>>>>>>>> - seq_printf(m, "THPeligible: %8u\n", >>>>>>>>>>>> - !!thp_vma_allowable_orders(vma, vma->vm_flags, >>>>>>>>>>>> - TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL)); >>>>>>>>>>>> + thp_eligible = !!thp_vma_allowable_orders(vma, vma->vm_flags, >>>>>>>>>>>> + TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL); >>>>>>>>>>>> + if (vma_is_anon_shmem(vma)) >>>>>>>>>>>> + thp_eligible = >>>>>>>>>>>> !!shmem_allowable_huge_orders(file_inode(vma->vm_file), >>>>>>>>>>>> + vma, vma->vm_pgoff, thp_eligible); >>>>>>>>>>> >>>>>>>>>>> Afraid I haven't been following the shmem mTHP support work as much as I >>>>>>>>>>> would >>>>>>>>>>> have liked, but is there a reason why we need a separate function for >>>>>>>>>>> shmem? >>>>>>>>>> >>>>>>>>>> Since shmem_allowable_huge_orders() only uses shmem specific logic to >>>>>>>>>> determine >>>>>>>>>> if huge orders are allowable, there is no need to complicate the >>>>>>>>>> thp_vma_allowable_orders() function by adding more shmem related logic, >>>>>>>>>> making >>>>>>>>>> it more bloated. In my view, providing a dedicated helper >>>>>>>>>> shmem_allowable_huge_orders(), specifically for shmem, simplifies the logic. >>>>>>>>> >>>>>>>>> My point was really that a single interface (thp_vma_allowable_orders) >>>>>>>>> should be >>>>>>>>> used to get this information. I have no strong opinon on how the >>>>>>>>> implementation >>>>>>>>> of that interface looks. What you suggest below seems perfectly reasonable >>>>>>>>> to me. >>>>>>>> >>>>>>>> Right. thp_vma_allowable_orders() might require some care as discussed in >>>>>>>> other >>>>>>>> context (cleanly separate dax and shmem handling/orders). But that would be >>>>>>>> follow-up cleanups. >>>>>>> >>>>>>> Are you planning to do that, or do you want me to send a patch? >>>>>> >>>>>> I'm planning on looking into some details, especially the interaction with large >>>>>> folios in the pagecache. I'll let you know once I have a better idea what >>>>>> actually should be done :) >>>>> >>>>> OK great - I'll scrub it from my todo list... really getting things done today :) >>>> >>>> Resolved the khugepaged thiny already? :P >>>> >>>> [khugepaged not active when only enabling the sub-size via the 2M folder IIRC] >>> >>> Hmm... baby brain? >> >> :) >> >> I think I only mentioned it in a private mail at some point. >> >>> >>> Sorry about that. I've been a bit useless lately. For some reason it wasn't on >>> my list, but its there now. Will prioritise it, because I agree it's not good. >> >> >> IIRC, if you do >> >> echo never > /sys/kernel/mm/transparent_hugepage/enabled >> echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled >> >> khugepaged will not get activated. > > khugepaged is controlled by the top level knob. What do you mean by "top level knob"? I assume /sys/kernel/mm/transparent_hugepage/enabled ? If so, that's not really a thing in its own right; its just the legacy PMD-size THP control, and we only take any notice of it if a per-size control is set to "inherit". So if we have: # echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled Then by design, /sys/kernel/mm/transparent_hugepage/enabled should be ignored. > But the above setting > sounds confusing, can we disable the top level knob, but enable it on > a per-order basis? TBH, it sounds weird and doesn't make too much > sense to me. Well that's the design and that's how its documented. It's done this way for back-compat. All controls are now per-size. But at boot, we default all per-size controls to "never" except for the PMD-sized control, which is defaulted to "inherit". That way, an unenlightened user-space can still control PMD-sized THP via the legacy (top-level) control. But enlightened apps can directly control per-size. I'm not sure how your way would work, because you would have 2 controls competing to do the same thing? > >> >> -- >> Cheers, >> >> David / dhildenb >> >>