Re: [PATCH 16/18] mm: memcontrol: charge swapin pages on instantiation

Joonsoo Kim <js1304@xxxxxxxxx> · Tue, 28 Apr 2020 15:49:41 +0900

2020년 4월 24일 (금) 오전 11:51, Johannes Weiner <hannes@xxxxxxxxxxx>님이 작성:
>
> On Fri, Apr 24, 2020 at 09:44:42AM +0900, Joonsoo Kim wrote:
> > On Mon, Apr 20, 2020 at 06:11:24PM -0400, Johannes Weiner wrote:
> > > @@ -412,31 +407,43 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > >                      */
> > >                     cond_resched();
> > >                     continue;
> > > -           } else if (err)         /* swp entry is obsolete ? */
> > > -                   break;
> > > -
> > > -           /* May fail (-ENOMEM) if XArray node allocation failed. */
> > > -           __SetPageLocked(new_page);
> > > -           __SetPageSwapBacked(new_page);
> > > -           err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
> > > -           if (likely(!err)) {
> > > -                   /* Initiate read into locked page */
> > > -                   SetPageWorkingset(new_page);
> > > -                   lru_cache_add_anon(new_page);
> > > -                   *new_page_allocated = true;
> > > -                   return new_page;
> > >             }
> > > -           __ClearPageLocked(new_page);
> > > -           /*
> > > -            * add_to_swap_cache() doesn't return -EEXIST, so we can safely
> > > -            * clear SWAP_HAS_CACHE flag.
> > > -            */
> > > -           put_swap_page(new_page, entry);
> > > -   } while (err != -ENOMEM);
> > > +           if (err)                /* swp entry is obsolete ? */
> > > +                   return NULL;
> >
> > "if (err)" is not needed since "!err" is already exiting the loop.
>
> But we don't want to leave the loop, we want to leave the
> function. For example, if swapcache_prepare() says the entry is gone
> (-ENOENT), we don't want to exit the loop and allocate a page for it.

Yes, so I said "if (err)" is not needed.
Just "return NULL;" would be enough.

> > > +
> > > +   /*
> > > +    * The swap entry is ours to swap in. Prepare a new page.
> > > +    */
> > > +
> > > +   page = alloc_page_vma(gfp_mask, vma, addr);
> > > +   if (!page)
> > > +           goto fail_free;
> > > +
> > > +   __SetPageLocked(page);
> > > +   __SetPageSwapBacked(page);
> > > +
> > > +   /* May fail (-ENOMEM) if XArray node allocation failed. */
> > > +   if (add_to_swap_cache(page, entry, gfp_mask & GFP_KERNEL))
> > > +           goto fail_unlock;
> > > +
> > > +   if (mem_cgroup_charge(page, NULL, gfp_mask & GFP_KERNEL, false))
> > > +           goto fail_delete;
> > > +
> >
> > I think that following order of operations is better than yours.
> >
> > 1. page alloc
> > 2. memcg charge
> > 3. swapcache_prepare
> > 4. add_to_swap_cache
> >
> > Reason is that page allocation and memcg charging could take for a
> > long time due to reclaim and other tasks waiting this swapcache page
> > could be blocked inbetween swapcache_prepare() and add_to_swap_cache().
>
> I see how that would be preferable, but memcg charging actually needs
> the swap(cache) information to figure out the cgroup that owned it at
> swapout, then uncharge the swapcache and drop the swap cgroup record.
>
> Maybe it could be done, but I'm not sure that level of surgery would
> be worth the benefits? Whoever else would be trying to swap the page
> in at the same time is likely in the same memory situation, and would
> not necessarily be able to allocate pages any faster.

Hmm, at least, some modifications are needed since waiting task would do
busy waiting in the loop and it wastes overall system cpu time.

I still think that changing operation order is better since it's possible that
later task allocates a page faster though it's not usual case. However, I
also agree your reasoning so will not insist more.

Thanks.