On Fri, Mar 19, 2021 at 01:44:55PM +0100, David Hildenbrand wrote:
> On 19.03.21 00:53, Balbir Singh wrote:
> > On Thu, Mar 18, 2021 at 05:57:06PM +0100, Vlastimil Babka wrote:
> > > On 3/11/21 7:14 PM, David Hildenbrand wrote:
> > > > Hi folks,
> > > >
> > > > I was wondering, is there any mechanism that reclaims basically empty page
> > > > tables in a running process?
> > > >
> > > > Like: When I MADV_DONTNEED a huge range, there could be plenty of basically
> > > > empty (e.g., all entries invalid) page tables we could reclaim. As soon as we
> > > > zap a complete PMD we could reclaim (depending on the architecture) a whole page.
> > > >
> > > > Zapping on the PMD level might make the most impact I guess.
> > > >
> > > > For 1 GB, we need 262144 4k pages. If we assume each PTE is 8 bytes, we need a
> > > > total of 2 MB for the lowest level page tables (PTE).
> > > >
> > > > OTOH, we would need 512 PMD entries - a single 4k page. Zapping 1 TB would mean
> > > > we can free up another 4MB - rather a corner case and we can live with that.
> > > >
> > > >
> > > > Of course, the same might apply to other cases where we can restore all page
> > > > table content from the VMA again. One example would be after MADV_FREE zapped a
> > > > whole range of entries we marked.
> > >
> > > I don't think we have such a mechanism, but IIRC I've heard the idea mentioned
> > > before, probably from Michal Hocko. Definitely an interesting research project
> > > idea to evaluate the cost vs benefits of that.
> > >
> > >
> > It might lead to interesting interactions with lockless page table walking
> > with implications on the mmap_lock as well.
> >
>
> I think if lockless page table walks have to be able to cope with THP code swapping
> populated page tables by a PMD back and forth, swapping an unpopulated page
> table by an invalid PMD entry might be quite similar. At least it feels like
> both approaches would rely on similar mechanisms / locking. :)
>

Yes, but then I suspect you always need to destruct page tables via RCU.

> I'm planning on looking into this, but not sure when I'll have time to
> prototype something up.
>
>

Thanks,
Balbir Singh.
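
P.S. For anyone who wants to re-check the page table arithmetic from the quoted mail, here is a minimal sketch. It assumes x86-64-style 4 KiB base pages, 8-byte entries and therefore 512 entries per table; the function name page_table_overhead is made up for illustration and is not a kernel interface.

```python
PAGE_SIZE = 4096                              # assumed 4 KiB base pages
ENTRY_SIZE = 8                                # assumed 8-byte page table entries
ENTRIES_PER_TABLE = PAGE_SIZE // ENTRY_SIZE   # 512 entries per table page

GIB = 1 << 30
TIB = 1 << 40


def page_table_overhead(mapped_bytes):
    """Return (pte_table_bytes, pmd_table_bytes) for a fully populated
    mapping of mapped_bytes using only 4 KiB pages."""
    pages = mapped_bytes // PAGE_SIZE
    # Each PTE table page maps ENTRIES_PER_TABLE pages (round up).
    pte_tables = -(-pages // ENTRIES_PER_TABLE)
    # Each PMD table page points to ENTRIES_PER_TABLE PTE tables (round up).
    pmd_tables = -(-pte_tables // ENTRIES_PER_TABLE)
    return pte_tables * PAGE_SIZE, pmd_tables * PAGE_SIZE


if __name__ == "__main__":
    pte, pmd = page_table_overhead(GIB)
    print(f"1 GiB: PTE tables {pte >> 20} MiB, PMD tables {pmd >> 10} KiB")
    pte, pmd = page_table_overhead(TIB)
    print(f"1 TiB: PTE tables {pte >> 30} GiB, PMD tables {pmd >> 20} MiB")
```

Under those assumptions, 1 GiB needs 2 MiB of PTE tables plus a single 4 KiB PMD page, and 1 TiB needs 4 MiB of PMD tables, which is why reclaiming at the PTE-table level is where most of the memory would come back.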