On 07/13/22 16:00, David Hildenbrand wrote:
> On 08.07.22 21:36, Khalid Aziz wrote:
> > On 7/8/22 05:47, David Hildenbrand wrote:
> >> On 02.07.22 06:24, Andrew Morton wrote:
> >>> On Wed, 29 Jun 2022 16:53:51 -0600 Khalid Aziz <khalid.aziz@xxxxxxxxxx> wrote:
>
> > suggestion to extend hugetlb PMD sharing was discussed briefly. Conclusion from that discussion and earlier discussion
> > on mailing list was hugetlb PMD sharing is built with special case code in too many places in the kernel and it is
> > better to replace it with something more general purpose than build even more on it. Mike can correct me if I got that
> > wrong.
>
> Yes, I pushed for the removal of that yet-another-hugetlb-special-stuff,
> and asked the honest question if we can just remove it and replace it by
> something generic in the future. And as I learned, we most probably
> cannot rip that out without affecting existing user space. Even
> replacing it by mshare() would degrade existing user space.
>
> So the natural thing to reduce page table consumption (again, what this
> cover letter talks about) for user space (semi- ?)automatically for
> MAP_SHARED files is to factor out what hugetlb has, and teach generic MM
> code to cache and reuse page tables (PTE and PMD tables should be
> sufficient) where suitable.
>
> For reasonably aligned mappings and mapping sizes, it shouldn't be too
> hard (I know, locking ...), to cache and reuse page tables attached to
> files -- similar to what hugetlb does, just in a generic way. We might
> want a mechanism to enable/disable this for specific processes and/or
> VMAs, but these are minor details.
>
> And that could come for free for existing user space, because page
> tables, and how they are handled, would just be an implementation detail.
>
>
> I'd be really interested into what the major roadblocks/downsides
> file-based page table sharing has. Because I am not convinced that a
> mechanism like mshare() -- that has to be explicitly implemented+used by
> user space -- is required for that.

Perhaps this is an 'opportunity' for me to write up in detail how
hugetlb pmd sharing works.  As you know, I have been struggling with
keeping that working AND safe AND performant.  Who knows, this may lead
to changes in the existing implementation.
--
Mike Kravetz