Re: [RFC PATCH 07/14] mm/khugepaged: add vm_flags_ignore to hugepage_vma_revalidate_pmd_count()

David Rientjes <rientjes@xxxxxxxxxx> · Thu, 10 Mar 2022 10:46:22 -0800 (PST)

On Thu, 10 Mar 2022, Yang Shi wrote:

> > This separates "async-hint" vs "sync-explicit" madvise requests.
> > MADV_[NO]HUGEPAGE are hints, and together with thp settings, advise
> > the kernel how to treat memory in the future. The kernel uses
> > VM_[NO]HUGEPAGE to aid with this. MADV_COLLAPSE, as an explicit
> > request, is free to define its own defrag semantics.
> >
> > This would allow flexibility to separately define async vs sync thp
> > policies. For example, highly tuned userspace applications that are
> > sensitive to unexpected latency might want to manage their hugepages
> > utilization themselves, and ask khugepaged to stay away. There is no
> > way in "always" mode to do this without setting VM_NOHUGEPAGE.
> 
> I don't quite get why you set THP to always but don't want
> khugepaged to do its job. It may be slow, I think this is why you
> introduce MADV_COLLAPSE, right? But it doesn't mean khugepaged can't
> scan the same area, it just doesn't do any real work and wastes some
> cpu cycles. But I guess MADV_COLLAPSE doesn't prevent the PMD/THP from
> being split, right? So khugepaged still plays a role to re-collapse
> the area without calling MADV_COLLAPSE over and over again.
> 

My only real concern for MADV_COLLAPSE was when the span being collapsed 
includes a mixture of both VM_HUGEPAGE and VM_NOHUGEPAGE.  Does this 
collapse over the eligible memory or does it fail entirely?

I'd think it would be the former: we should respect VM_NOHUGEPAGE and only 
collapse eligible memory when doing MADV_COLLAPSE.  But then userspace 
struggles to know whether a partial collapse happened because of 
ineligibility or because we just couldn't allocate a hugepage.

Userspace has the information to figure this out on its own, so given the 
use of VM_NOHUGEPAGE for non-MADV_NOHUGEPAGE purposes, I think it makes 
sense to simply ignore these vmas as part of the collapse request.
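
A minimal sketch of that "skip ineligible vmas" behavior, written as a 
standalone C model rather than kernel code.  Everything here is 
hypothetical and for illustration only: struct sketch_vma, the 
SKETCH_VM_* flag values, and sketch_collapse_span() are not taken from 
the patch series, and the real collapse path does far more than count.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits only; not the kernel's VM_HUGEPAGE/VM_NOHUGEPAGE values. */
#define SKETCH_VM_HUGEPAGE   0x1UL
#define SKETCH_VM_NOHUGEPAGE 0x2UL

/* Hypothetical stand-in for a vma: a half-open range [start, end) plus flags. */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* What a caller could learn from one collapse request over a span. */
struct sketch_result {
	size_t attempted;	/* vmas a collapse would be attempted on */
	size_t skipped;		/* vmas ignored because of VM_NOHUGEPAGE */
};

/*
 * Walk every vma overlapping [start, end).  Ineligible vmas are skipped
 * rather than failing the whole request, so an allocation failure would
 * be the only remaining reason for a partial collapse.
 */
static struct sketch_result
sketch_collapse_span(const struct sketch_vma *vmas, size_t n,
		     unsigned long start, unsigned long end)
{
	struct sketch_result res = { 0, 0 };

	assert(vmas != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Not part of the requested span at all. */
		if (vmas[i].end <= start || vmas[i].start >= end)
			continue;

		/* Respect VM_NOHUGEPAGE: ignore this vma, keep going. */
		if (vmas[i].flags & SKETCH_VM_NOHUGEPAGE) {
			res.skipped++;
			continue;
		}

		res.attempted++;
	}

	return res;
}
```

In this model a span covering one VM_HUGEPAGE vma, one VM_NOHUGEPAGE vma 
and one unflagged vma yields two attempted and one skipped, and the 
request as a whole still goes ahead.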