Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES

On Thu, Nov 5, 2020 at 9:35 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Thu 05-11-20 21:10:12, Yafang Shao wrote:
> > We use the memory utilization (Used / Total) to monitor memory
> > pressure. If it is too high, the system may hit OOM sooner or later
> > when swap is off, so we then make adjustments on that system.
> >
> > However, this method has been broken since MADV_FREE was introduced,
> > because lazily freed anonymous pages can be reclaimed under memory
> > pressure while they are still accounted in NR_ANON_MAPPED.
> >
> > Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> > LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> > from the anon LRU list to the file LRU list. That means
> > (Inactive(file) + Active(file)) may be much larger than Cached in
> > /proc/meminfo, which confuses our users.
> >
> > So we'd better account the lazily freed anonymous pages in
> > NR_FILE_PAGES as well.
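
For the Cached discrepancy above, a minimal userspace sketch, assuming
the usual /proc/meminfo field names. Buffers and Cached are both carved
out of NR_FILE_PAGES, while the file LRU sizes include lazyfree pages
that NR_FILE_PAGES does not; shmem and swap cache also shift these
fields, so the residual is only a rough hint of lazyfree pages, not an
exact count:

#include <stdio.h>
#include <string.h>

static long meminfo_kb(const char *key)
{
        char line[128];
        long val = -1;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f)) {
                if (!strncmp(line, key, strlen(key))) {
                        sscanf(line + strlen(key), " %ld", &val);
                        break;
                }
        }
        fclose(f);
        return val;
}

int main(void)
{
        /* Large positive residual: likely lazyfree pages on the
         * file LRUs that are missing from Cached. */
        long residual = meminfo_kb("Active(file):") +
                        meminfo_kb("Inactive(file):") -
                        meminfo_kb("Cached:") - meminfo_kb("Buffers:");

        printf("file LRU minus (Cached + Buffers): %ld kB\n", residual);
        return 0;
}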
>
> Can you simply subtract the lazyfree pages in userspace?

Could you please tell me how to subtract the lazyfree pages in userspace?
Please note that we can't use (pglazyfree - pglazyfreed), because
pglazyfreed is only counted in the regular reclaim path and not in the
process exit path; that means we would have to introduce another
counter like LazyPage....
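
For illustration, a minimal sketch of that naive subtraction, reading
the two event counters from /proc/vmstat; it shows the problem, because
the difference only drifts upward over time:

#include <stdio.h>
#include <string.h>

static long vmstat_val(const char *key)
{
        char name[64];
        long val;
        FILE *f = fopen("/proc/vmstat", "r");

        if (!f)
                return -1;
        while (fscanf(f, "%63s %ld", name, &val) == 2) {
                if (!strcmp(name, key)) {
                        fclose(f);
                        return val;
                }
        }
        fclose(f);
        return -1;
}

int main(void)
{
        /*
         * Naive estimate of outstanding lazyfree pages. pglazyfreed is
         * bumped solely in the regular reclaim path, so lazyfree pages
         * dropped at process exit (or recovered by re-dirtying) are
         * never subtracted back out of this difference.
         */
        printf("pglazyfree - pglazyfreed = %ld pages\n",
               vmstat_val("pglazyfree") - vmstat_val("pglazyfreed"));
        return 0;
}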

> I am afraid your
> patch just makes the situation even muddier. NR_ANON_MAPPED is really
> meant to tell how many anonymous pages are mapped, and MADV_FREE pages
> are mapped until they are freed. NR_*_FILE reflect the size of the LRU
> lists, and NR_FILE_PAGES reflects the number of page cache pages, but
> madvfree pages are not page cache. They are aged together with file
> pages, but they are not the same thing. In the same way, shmem pages
> are page cache living on the anon LRUs.
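
(As an aside for readers following the thread: a minimal example of how
pages become lazyfree in the first place. madvise(MADV_FREE) has been
available since Linux 4.5.)

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
        size_t len = 64 << 20;  /* 64 MB of anonymous memory */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (buf == MAP_FAILED)
                return 1;

        memset(buf, 1, len);            /* fault the pages in */
        madvise(buf, len, MADV_FREE);   /* mark them lazily freeable */

        /* The pages now sit on the file LRU (since f7ad2a6cb9f7) but
         * are still counted in NR_ANON_MAPPED; compare /proc/meminfo
         * before and after while this waits. Writing to the range
         * again would cancel the hint for the touched pages. */
        getchar();

        munmap(buf, len);
        return 0;
}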
>
> Confusing? Tricky? Yes, likely. But I do not think we want to bend those
> counters even further.
>
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > Cc: Minchan Kim <minchan@xxxxxxxxxx>
> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> >  mm/memcontrol.c | 11 +++++++++--
> >  mm/rmap.c       | 26 ++++++++++++++++++--------
> >  mm/swap.c       |  2 ++
> >  mm/vmscan.c     |  2 ++
> >  4 files changed, 31 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 3dcbf24d2227..217a6f10fa8d 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5659,8 +5659,15 @@ static int mem_cgroup_move_account(struct page *page,
> >
> >       if (PageAnon(page)) {
> >               if (page_mapped(page)) {
> > -                     __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> > -                     __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > +                     if (!PageSwapBacked(page) && !PageSwapCache(page) &&
> > +                         !PageUnevictable(page)) {
> > +                             __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages);
> > +                             __mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages);
> > +                     } else {
> > +                             __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> > +                             __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > +                     }
> > +
> >                       if (PageTransHuge(page)) {
> >                               __mod_lruvec_state(from_vec, NR_ANON_THPS,
> >                                                  -nr_pages);
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 1b84945d655c..690ca7ff2392 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
> >       if (unlikely(PageMlocked(page)))
> >               clear_page_mlock(page);
> >
> > -     if (nr)
> > -             __mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> > +     if (nr) {
> > +             if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> > +                 !PageSwapCache(page) && !PageUnevictable(page))
> > +                     __mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> > +             else
> > +                     __mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> > +     }
> >  }
> >
> >  /**
> > @@ -1341,12 +1346,17 @@ void page_remove_rmap(struct page *page, bool compound)
> >       if (!atomic_add_negative(-1, &page->_mapcount))
> >               goto out;
> >
> > -     /*
> > -      * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> > -      * these counters are not modified in interrupt context, and
> > -      * pte lock(a spinlock) is held, which implies preemption disabled.
> > -      */
> > -     __dec_lruvec_page_state(page, NR_ANON_MAPPED);
> > +     if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> > +         !PageSwapCache(page) && !PageUnevictable(page)) {
> > +             __dec_lruvec_page_state(page, NR_FILE_PAGES);
> > +     } else {
> > +             /*
> > +              * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> > +              * these counters are not modified in interrupt context, and
> > +              * pte lock(a spinlock) is held, which implies preemption disabled.
> > +              */
> > +             __dec_lruvec_page_state(page, NR_ANON_MAPPED);
> > +     }
> >
> >       if (unlikely(PageMlocked(page)))
> >               clear_page_mlock(page);
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47a47681c86b..340c5276a0f3 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -601,6 +601,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
> >
> >               del_page_from_lru_list(page, lruvec,
> >                                      LRU_INACTIVE_ANON + active);
> > +             __mod_lruvec_state(lruvec, NR_ANON_MAPPED, -nr_pages);
> >               ClearPageActive(page);
> >               ClearPageReferenced(page);
> >               /*
> > @@ -610,6 +611,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
> >                */
> >               ClearPageSwapBacked(page);
> >               add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> > +             __mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_pages);
> >
> >               __count_vm_events(PGLAZYFREE, nr_pages);
> >               __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 1b8f0e059767..4821124c70f7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1428,6 +1428,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
> >                               goto keep_locked;
> >                       }
> >
> > +                     mod_lruvec_page_state(page, NR_ANON_MAPPED, nr_pages);
> > +                     mod_lruvec_page_state(page, NR_FILE_PAGES, -nr_pages);
> >                       count_vm_event(PGLAZYFREED);
> >                       count_memcg_page_event(page, PGLAZYFREED);
> >               } else if (!mapping || !__remove_mapping(mapping, page, true,
> > --
> > 2.18.4
> >
>
> --
> Michal Hocko
> SUSE Labs



-- 
Thanks
Yafang



