[...]
>>>>> We don't want reclamation overhead later, and we want memory immediately
>>>>> available to others.
>>>>
>>>> But by that logic, you also don't want to leave the large folio partially mapped
>>>> all the way until the last subpage is CoWed. Surely you would want to reclaim it
>>>> when you reach partial map status?
>>>
>>> To some extent, I agree. But then we will have too many copies. The last
>>> subpage is small, and a safe place to copy instead.
>>>
>>> We actually had to tune userspace to decrease partial mapping, as too much
>>> partial mapping both unfolded CONT-PTE and wasted too much memory. If a
>>> VMA had too much partial mapping, we disabled mTHP on this VMA.
>>
>> I actually had a wacky idea around introducing a selectable page size ABI
>> per-process that might help here. I know Android is doing work to make the
>> system 16K page compatible. You could run most of the system processes with a 16K
>> ABI on top of a 4K kernel. Then those processes don't even have the ability to
>> madvise/munmap/mprotect/mremap anything less than 16K alignment, so that acts as
>> an anti-fragmentation mechanism while allowing non-16K capable processes to run
>> side-by-side. Just a passing thought...
>
> Right, this project faces a challenge in supporting legacy
> 4KiB-aligned applications.
> But I don't think it will be an issue to run 16KiB-aligned applications
> on a kernel whose page size is 4KiB.

Yes, agreed that a 16K-aligned (or 64K-aligned) app will work without issue on a
4K kernel, but it will also use getpagesize() and know what the page size is.
I'm suggesting you could actually run these apps on a 4K kernel but with a 16K
ABI and potentially get close to the native 16K performance out of them. It's
just a thought though - I don't have any data that actually shows this is better
than just running on a 4K kernel with a 4K ABI, and using 16K or 64K mTHP
opportunistically.
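
To make the 16K-ABI idea a bit more concrete, here is a rough userspace sketch
in Python. It is purely illustrative: it is not kernel code and not taken from
any patch. ABI_PAGE_SIZE and abi_effective_range() are hypothetical names, and
the behaviour is an assumption that simply mirrors how munmap()/mprotect()
treat the real page size today (address must be page-aligned, length is rounded
up), applied to a 16K ABI page size instead of the kernel's 4K.

```python
# Hypothetical per-process ABI page size (an assumption for illustration;
# no such kernel interface exists today).
ABI_PAGE_SIZE = 16 * 1024


def abi_effective_range(addr: int, length: int, abi_page: int = ABI_PAGE_SIZE):
    """Model how a 16K-ABI process might see a munmap/mprotect-style range.

    The start address must be aligned to the ABI page size (otherwise the
    call would fail, modelled here as ValueError standing in for EINVAL).
    The length is rounded up to a whole number of ABI pages.
    """
    if addr % abi_page != 0:
        raise ValueError("EINVAL: address not aligned to ABI page size")
    rounded = -(-length // abi_page) * abi_page  # ceiling to ABI page multiple
    return addr, rounded


if __name__ == "__main__":
    # A 4K request at a 16K-aligned address covers a full 16K ABI page.
    print(abi_effective_range(0x10000, 4096))

    # A 4K-aligned (but not 16K-aligned) address is rejected outright, so
    # the process cannot carve a 4K hole out of a 16K-aligned block.
    try:
        abi_effective_range(0x11000, 4096)
    except ValueError as err:
        print(err)
```

The point of the sketch is only the anti-fragmentation property: if every
range operation is forced to 16K granularity, a 16K mTHP-backed region can
never end up partially mapped by that process.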