On 13.07.22 19:58, Mike Kravetz wrote:
> On 07/13/22 16:00, David Hildenbrand wrote:
>> On 08.07.22 21:36, Khalid Aziz wrote:
>>> On 7/8/22 05:47, David Hildenbrand wrote:
>>>> On 02.07.22 06:24, Andrew Morton wrote:
>>>>> On Wed, 29 Jun 2022 16:53:51 -0600 Khalid Aziz <khalid.aziz@xxxxxxxxxx> wrote:
>>
>>> suggestion to extend hugetlb PMD sharing was discussed briefly. Conclusion from that discussion and earlier discussion
>>> on mailing list was hugetlb PMD sharing is built with special case code in too many places in the kernel and it is
>>> better to replace it with something more general purpose than build even more on it. Mike can correct me if I got that
>>> wrong.
>>
>> Yes, I pushed for the removal of that yet-another-hugetlb-special-stuff,
>> and asked the honest question if we can just remove it and replace it by
>> something generic in the future. And as I learned, we most probably
>> cannot rip that out without affecting existing user space. Even
>> replacing it by mshare() would degrade existing user space.
>>
>> So the natural thing to reduce page table consumption (again, what this
>> cover letter talks about) for user space (semi- ?)automatically for
>> MAP_SHARED files is to factor out what hugetlb has, and teach generic MM
>> code to cache and reuse page tables (PTE and PMD tables should be
>> sufficient) where suitable.
>>
>> For reasonably aligned mappings and mapping sizes, it shouldn't be too
>> hard (I know, locking ...), to cache and reuse page tables attached to
>> files -- similar to what hugetlb does, just in a generic way. We might
>> want a mechanism to enable/disable this for specific processes and/or
>> VMAs, but these are minor details.
>>
>> And that could come for free for existing user space, because page
>> tables, and how they are handled, would just be an implementation detail.
>>
>>
>> I'd be really interested into what the major roadblocks/downsides
>> file-based page table sharing has.
>> Because I am not convinced that a
>> mechanism like mshare() -- that has to be explicitly implemented+used by
>> user space -- is required for that.
>
> Perhaps this is an 'opportunity' for me to write up in detail how
> hugetlb pmd sharing works. As you know, I have been struggling with
> keeping that working AND safe AND performant.

Yes, and I have your locking-related changes in my inbox marked as "to
be reviewed" :D

Shedding some light on that would be highly appreciated, especially how
hugetlb-specific it currently is and for which reason.

--
Thanks,

David / dhildenb