On Mon, May 9, 2022 at 9:05 AM Vlastimil Babka <vbabka@xxxxxxx> wrote: > > On 4/4/22 22:02, Yang Shi wrote: > > include/linux/huge_mm.h | 14 ++++++++++++ > > include/linux/khugepaged.h | 59 ++++++++++++--------------------------------------- > > include/linux/sched/coredump.h | 3 ++- > > kernel/fork.c | 4 +--- > > mm/huge_memory.c | 15 ++++--------- > > mm/khugepaged.c | 76 +++++++++++++++++++++++++++++++++++++----------------------------- > > mm/mmap.c | 14 ++++++++---- > > mm/shmem.c | 12 ----------- > > 8 files changed, 88 insertions(+), 109 deletions(-) > > Resending my general feedback from mm-commits thread to include the > public ML's: > > There's modestly less lines in the end, some duplicate code removed, > special casing in shmem.c removed, that's all good as it is. Also patch 8/8 > become quite boring in v3, no need to change individual filesystems and also > no hook in fault path, just the common mmap path. So I would just handle > patch 6 differently as I just replied to it, and acked the rest. > > That said it's still unfortunately rather a mess of functions that have > similar names. transhuge_vma_enabled(vma). hugepage_vma_check(vma), > transparent_hugepage_active(vma), transhuge_vma_suitable(vma, addr)? > So maybe still some space for further cleanups. But the series is fine as it > is so we don't have to wait for it now. Yeah, I agree that we do have a lot thp checks. Will find some time to look into it deeper later. > > We could also consider that the tracking of which mm is to be scanned is > modelled after ksm which has its own madvise flag, but also no "always" > mode. What if for THP we only tracked actual THP madvised mm's, and in > "always" mode just scanned all vm's, would that allow ripping out some code > perhaps, while not adding too many unnecessary scans? If some processes are Do you mean add all mm(s) to the scan list unconditionally? I don't think it will scale. > being scanned without any effect, maybe track success separately, and scan > them less frequently etc. That could be ultimately more efficinet than > painfully tracking just *eligibility* for scanning in "always" mode? Sounds like we need a couple of different lists, for example, inactive and active? And promote or demote mm(s) between the two lists? TBH I don't see too many benefits at the moment. Or I misunderstood you? > > Even more radical thing to consider (maybe that's a LSF/MM level topic, too > bad :) is that we scan pagetables in ksm, khugepaged, numa balancing, soon > in MGLRU, and I probably forgot something else. Maybe time to think about > unifying those scanners? We do have pagewalk (walk_page_range()) which is used by a couple of mm stuff, for example, mlock, mempolicy, mprotect, etc. I'm not sure whether it is feasible for khugepaged, ksm, etc, or not since I didn't look that hard. But I agree it should be worth looking at. > >