On 2/6/22 22:40, Hugh Dickins wrote:
> Previous patches have been preparatory: now implement page->mlock_count.
> The ordering of the "Unevictable LRU" is of no significance, and there is
> no point holding unevictable pages on a list: place page->mlock_count to
> overlay page->lru.prev (since page->lru.next is overlaid by compound_head,
> which needs to be even so as not to satisfy PageTail - though 2 could be
> added instead of 1 for each mlock, if that's ever an improvement).
>
> But it's only safe to rely on or modify page->mlock_count while lruvec
> lock is held and page is on unevictable "LRU" - we can save lots of edits
> by continuing to pretend that there's an imaginary LRU here (there is an
> unevictable count which still needs to be maintained, but not a list).
>
> The mlock_count technique suffers from an unreliability much like with
> page_mlock(): while someone else has the page off LRU, not much can
> be done. As before, err on the safe side (behave as if mlock_count 0),
> and let try_to_unlock_one() move the page to unevictable if reclaim finds
> out later on - a few misplaced pages don't matter, what we want to avoid
> is imbalancing reclaim by flooding evictable lists with unevictable pages.
>
> I am not a fan of "if (!isolate_lru_page(page)) putback_lru_page(page);":
> if we have taken lruvec lock to get the page off its present list, then
> we save everyone trouble (and however many extra atomic ops) by putting
> it on its destination list immediately.

Good point.

> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
>  include/linux/mm_inline.h | 11 +++++--
>  include/linux/mm_types.h  | 19 +++++++++--
>  mm/huge_memory.c          |  5 ++-
>  mm/memcontrol.c           |  3 +-
>  mm/mlock.c                | 68 +++++++++++++++++++++++++++++++--------
>  mm/mmzone.c               |  7 ++++
>  mm/swap.c                 |  1 +
>  7 files changed, 92 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index b725839dfe71..884d6f6af05b 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -99,7 +99,8 @@ void lruvec_add_folio(struct lruvec *lruvec, struct folio *folio)
>
>          update_lru_size(lruvec, lru, folio_zonenum(folio),
>                          folio_nr_pages(folio));
> -        list_add(&folio->lru, &lruvec->lists[lru]);
> +        if (lru != LRU_UNEVICTABLE)
> +                list_add(&folio->lru, &lruvec->lists[lru]);
>  }
>
>  static __always_inline void add_page_to_lru_list(struct page *page,
> @@ -115,6 +116,7 @@ void lruvec_add_folio_tail(struct lruvec *lruvec, struct folio *folio)
>
>          update_lru_size(lruvec, lru, folio_zonenum(folio),
>                          folio_nr_pages(folio));
> +        /* This is not expected to be used on LRU_UNEVICTABLE */

Felt uneasy about this at first because it's just a _tail version of
lruvec_add_folio, and there's probably nothing fundamental that prevents
users of _tail from encountering unevictable pages. But if the assumption
is ever violated, the poisoned list head should make it immediately clear,
so I guess that's fine.

>          list_add_tail(&folio->lru, &lruvec->lists[lru]);
>  }
>
> @@ -127,8 +129,11 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page,
>  static __always_inline
>  void lruvec_del_folio(struct lruvec *lruvec, struct folio *folio)
>  {
> -        list_del(&folio->lru);
> -        update_lru_size(lruvec, folio_lru_list(folio), folio_zonenum(folio),
> +        enum lru_list lru = folio_lru_list(folio);
> +
> +        if (lru != LRU_UNEVICTABLE)
> +                list_del(&folio->lru);
> +        update_lru_size(lruvec, lru, folio_zonenum(folio),
>                          -folio_nr_pages(folio));
>  }
>
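
Side note for anyone reading along: the overlay described in the commit
message is easy to demonstrate in isolation. Below is a minimal,
hypothetical userspace sketch (struct and field names are made up for
illustration, not the patch's actual mm_types.h layout) of a count reusing
the lru.prev word while lru.next stays an ordinary, even pointer, so it can
never look like a compound_head value to a PageTail-style check.

/*
 * Illustrative sketch only - not the patch's mm_types.h change.
 * The count overlays the second list_head word (lru.prev); the first
 * word (lru.next) remains a real, aligned pointer, so its low bit is
 * never set and it cannot be mistaken for a compound_head value.
 */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct list_head {
        struct list_head *next, *prev;
};

struct demo_page {                      /* hypothetical stand-in for struct page */
        union {
                struct list_head lru;   /* evictable LRUs: normal list linkage */
                struct {                /* unevictable "LRU": no list, just a count */
                        void *filler;   /* overlays lru.next; stays an even pointer */
                        unsigned int mlock_count;       /* overlays lru.prev */
                };
        };
};

int main(void)
{
        /* the count really does share storage with lru.prev */
        assert(offsetof(struct demo_page, mlock_count) ==
               offsetof(struct demo_page, lru.prev));

        struct demo_page page;
        page.filler = NULL;     /* keep the "compound_head" word even (bit 0 clear) */
        page.mlock_count = 3;   /* bump the count instead of linking onto a list */

        printf("mlock_count=%u overlays lru.prev at offset %zu\n",
               page.mlock_count, offsetof(struct demo_page, lru.prev));
        return 0;
}

Built with any C11 compiler this runs and the assert holds, since the union
guarantees mlock_count and lru.prev share storage. What the sketch cannot
show is the locking rule from the commit message: the count is only
meaningful while lruvec lock is held and the page is on the unevictable
"LRU".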