On Tue, Feb 15, 2022 at 01:38:20PM -0800, Hugh Dickins wrote:
> On Tue, 15 Feb 2022, Matthew Wilcox wrote:
> > On Mon, Feb 14, 2022 at 06:26:39PM -0800, Hugh Dickins wrote:
> > > Add vma argument to mlock_vma_page() and munlock_vma_page(), make them
> > > inline functions which check (vma->vm_flags & VM_LOCKED) before calling
> > > mlock_page() and munlock_page() in mm/mlock.c.
> > >
> > > Add bool compound to mlock_vma_page() and munlock_vma_page(): this is
> > > because we have understandable difficulty in accounting pte maps of THPs,
> > > and if passed a PageHead page, mlock_page() and munlock_page() cannot
> > > tell whether it's a pmd map to be counted or a pte map to be ignored.
> > >
> > [...]
> > >
> > > Mlock accounting on THPs has been hard to define, differed between anon
> > > and file, involved PageDoubleMap in some places and not others, required
> > > clear_page_mlock() at some points.  Keep it simple now: just count the
> > > pmds and ignore the ptes, there is no reason for ptes to undo pmd mlocks.
> >
> > How would you suggest we handle the accounting for folios which are
> > intermediate in size between PMDs and PTEs?  eg, an order-4 page?
> > Would it make sense to increment mlock_count by HUGE_PMD_NR for
> > each PMD mapping and by 1 for each PTE mapping?
>
> I think you're asking the wrong question here, but perhaps you've
> already decided there's only one satisfactory answer to the right question.

Or I've gravely misunderstood the situation.  Or explained my concern
badly.  The possibilities are endless!

My concern is that a filesystem may create an order-4 folio, an application
mmaps the folio and then calls mlock() (either over a portion or the
entirety of the folio).  As far as I can tell, we then do not move the folio
onto the unevictable list because it is of order >0 and is only mapped
by PTEs.  This presumably then has performance problems (or we wouldn't
need to have an unevictable list in the first place).

> The question I thought you should be asking is about how to count them
> in Mlocked.  That's tough; but I take it for granted that you would not
> want per-subpage flags and counts involved (or not unless forced to do
> so by some regression that turns out to matter).  And I think the only
> satisfactory answer is to count the whole compound_nr() as Mlocked
> when any part of it (a single pte, a series of ptes, a pmd) is mlocked;
> and (try to) move folio to Unevictable whenever any part of it is mlocked.

I think that makes sense.  As with so many other things, we choose to
manage memory in >PAGE_SIZE chunks.  If you mlock() a part of a folio,
we lock the whole folio in memory, and it all counts as being locked.
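
To make sure we mean the same thing, here is a rough userspace model of
that rule. It is only a sketch, not kernel code. struct folio_model,
model_mlock(), model_munlock() and nr_mlocked are names invented for
illustration, and I am assuming one mlock_count increment per VM_LOCKED
mapping regardless of whether it is a pte or a pmd mapping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a folio; none of this is the kernel's layout. */
struct folio_model {
	unsigned int order;       /* folio spans 1 << order base pages */
	unsigned int mlock_count; /* assumed: one per VM_LOCKED mapping */
	bool unevictable;         /* models being on the unevictable list */
};

/* Models the Mlocked counter, in units of base pages. */
static long nr_mlocked;

static unsigned long folio_model_nr_pages(const struct folio_model *f)
{
	return 1UL << f->order;
}

/*
 * Any mlocked part (a single pte, a series of ptes, a pmd) pins the
 * whole folio: the first mlock accounts every page of the folio and
 * marks it unevictable; later mlocks only bump the count.
 */
static void model_mlock(struct folio_model *f)
{
	if (f->mlock_count++ == 0) {
		nr_mlocked += folio_model_nr_pages(f);
		f->unevictable = true;
	}
}

/* The last munlock undoes the whole-folio accounting. */
static void model_munlock(struct folio_model *f)
{
	assert(f->mlock_count > 0);
	if (--f->mlock_count == 0) {
		nr_mlocked -= folio_model_nr_pages(f);
		f->unevictable = false;
	}
}
```

With this model, a single pte mlock of an order-4 folio adds 16 pages to
the modelled Mlocked and makes the folio unevictable. A second mlock of
the same folio changes neither, and only the final munlock makes it
evictable again.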