Hi Hugh,

On Wed, Feb 9, 2022 at 3:46 AM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> A weakness of the page->mlock_count approach is the need for lruvec lock
> while holding page table lock. That is not an overhead we would allow on
> normal pages, but I think acceptable just for pages in an mlocked area.
> But let's try to amortize the extra cost by gathering on per-cpu pagevec
> before acquiring the lruvec lock.
>
> I have an unverified conjecture that the mlock pagevec might work out
> well for delaying the mlock processing of new file pages until they have
> got off lru_cache_add()'s pagevec and on to LRU.
>
> The initialization of page->mlock_count is subject to races and awkward:
> 0 or !!PageMlocked or 1? Was it wrong even in the implementation before
> this commit, which just widens the window? I haven't gone back to think
> it through. Maybe someone can point out a better way to initialize it.
>
> Bringing lru_cache_add_inactive_or_unevictable()'s mlock initialization
> into mm/mlock.c has helped: mlock_new_page(), using the mlock pagevec,
> rather than lru_cache_add()'s pagevec.
>
> Experimented with various orderings: the right thing seems to be for
> mlock_page() and mlock_new_page() to TestSetPageMlocked before adding to
> pagevec, but munlock_page() to leave TestClearPageMlocked to the later
> pagevec processing.
>
> Dropped the VM_BUG_ON_PAGE(PageTail)s this time around: they have made
> their point, and the thp_nr_page()s already contain a VM_BUG_ON_PGFLAGS()
> for that.
>
> This still leaves acquiring lruvec locks under page table lock each time
> the pagevec fills (or a THP is added): which I suppose is rather silly,
> since they sit on pagevec waiting to be processed long after page table
> lock has been dropped; but I'm disinclined to uglify the calling sequence
> until some load shows an actual problem with it (nothing wrong with
> taking lruvec lock under page table lock, just "nicer" to do it less).
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>

Thanks for your patch, which is now commit cbaf47432c909044 ("mm/munlock:
mlock_page() munlock_page() batch by pagevec") in next-20220209.

> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -402,7 +402,8 @@ extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
>   *
>   * mlock is usually called at the end of page_add_*_rmap(),
>   * munlock at the end of page_remove_rmap(); but new anon
> - * pages are managed in lru_cache_add_inactive_or_unevictable().
> + * pages are managed by lru_cache_add_inactive_or_unevictable()
> + * calling mlock_new_page().
>   *
>   * @compound is used to include pmd mappings of THPs, but filter out
>   * pte mappings of THPs, which cannot be consistently counted: a pte
> @@ -425,6 +426,9 @@ static inline void munlock_vma_page(struct page *page,
>  	    (compound || !PageTransCompound(page)))
>  		munlock_page(page);
>  }
> +void mlock_new_page(struct page *page);
> +bool need_mlock_page_drain(int cpu);
> +void mlock_page_drain(int cpu);

This is inside an #ifdef CONFIG_MMU section.
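Presumably the !CONFIG_MMU side of that #ifdef needs no-op fallbacks for
these three, so that generic code keeps compiling (see the build failure
quoted below). As an untested sketch of the direction only, not the actual
fix, something like:

    /*
     * Untested sketch only: no-op stubs for !CONFIG_MMU, on the
     * assumption that nommu has no mlock pagevec to gather or drain.
     */
    static inline void mlock_new_page(struct page *page) { }
    static inline bool need_mlock_page_drain(int cpu) { return false; }
    static inline void mlock_page_drain(int cpu) { }

(Returning false from need_mlock_page_drain() is my assumption of the
sensible nommu behaviour; you may well prefer a different shape for this.)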
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -640,6 +634,7 @@ void lru_add_drain_cpu(int cpu)
>  		pagevec_lru_move_fn(pvec, lru_lazyfree_fn);
>
>  	activate_page_drain(cpu);
> +	mlock_page_drain(cpu);

noreply@xxxxxxxxxxxxxx reported for m5272c3_defconfig:

mm/swap.c:637:2: error: implicit declaration of function ‘mlock_page_drain’ [-Werror=implicit-function-declaration]

http://kisskb.ellerman.id.au/kisskb/buildresult/14694567/

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds