RE: [PATCH v4 0/4] mm: ZSWAP swap-out of mTHP folios

"Sridhar, Kanchana P" <kanchana.p.sridhar@xxxxxxxxx> · Tue, 20 Aug 2024 22:09:24 +0000

Hi Nhat,

> -----Original Message-----
> From: Nhat Pham <nphamcs@xxxxxxxxx>
> Sent: Tuesday, August 20, 2024 2:14 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: Huang, Ying <ying.huang@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; hannes@xxxxxxxxxxx; yosryahmed@xxxxxxxxxx;
> ryan.roberts@xxxxxxx; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> Zou, Nanhai <nanhai.zou@xxxxxxxxx>; Feghali, Wajdi K
> <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v4 0/4] mm: ZSWAP swap-out of mTHP folios
> 
> On Mon, Aug 19, 2024 at 11:01 PM Sridhar, Kanchana P
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > Hi Ying,
> >
> > I confirmed that in the case of usemem, all calls to [1] occur from the
> > code path in [3]. However, my takeaway from this is that the more reclaim
> > that results in zswap_store(), e.g., from mTHP folios, the higher the
> > likelihood of overage recorded per-process in
> > current->memcg_nr_pages_over_high, which could potentially be causing
> > each process to reclaim memory, even if it is possible that the swapout
> > from a few of the 70 processes could have brought the parent cgroup
> > under the limit.
> 
> Yeah IIUC, the memory increase from zswap store happens
> immediately/synchronously (swap_writepage() -> zswap_store() ->
> obj_cgroup_charge_zswap()), before the memory saving kicks in. This is
> a non-issue for swap - the memory saving doesn't happen right away,
> but it also doesn't increase memory usage (well, as you pointed out,
> obj_cgroup_charge_zswap() doesn't even happen).
> 
> And yes, this is compounded a) if you're in a high concurrency regime,
> where all tasks in the same cgroup, under memory pressure, all go into
> reclaim, and b) for larger folios, where we compress multiple pages
> before the saving happens. I wonder how bad the effect is tho - could
> you quantify the reclamation amount that happens per zswap store
> somehow with tracing magic?

Thanks very much for the detailed comments and explanations!
Sure, I will gather data on the reclamation amount that happens per
zswap store and share.
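
As a rough way to size the effect discussed above, here is a toy model
(not kernel code, and not measured data). It assumes a 4 KiB base page
and that every compressed object is charged while the folio's own pages
are still charged; the compressed fraction passed in is a made-up input,
and the function names are hypothetical. Only the count of 70 tasks
comes from this thread.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size


def transient_overage(nr_pages: int, compressed_fraction: float) -> int:
    """Bytes charged on top of the still-charged folio during one store.

    Assumes each page's compressed object is charged before any of the
    folio's original pages are uncharged.
    """
    return int(nr_pages * PAGE_SIZE * compressed_fraction)


def cgroup_overage(nr_tasks: int, nr_pages: int,
                   compressed_fraction: float) -> int:
    """Worst case where every task is mid-store on one folio at once."""
    return nr_tasks * transient_overage(nr_pages, compressed_fraction)


if __name__ == "__main__":
    # Illustrative inputs: 4K page vs. 64K folio, 50% compressed size.
    print(transient_overage(1, 0.5))
    print(transient_overage(16, 0.5))
    print(cgroup_overage(70, 16, 0.5))
```

Under these assumptions the transient charge scales linearly with the
folio size and with the number of tasks storing concurrently, which is
the compounding described in a) and b) above.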

> 
> Also, I wonder if there is a "charge delta" mechanism, where we
> directly uncharge by (page size - zswap object size), to avoid the
> temporary double charging... Sort of like what folio migration is
> doing now vs. what it used to do. Seems complicated - not even sure if
> it's possible TBH.

Yes, this is a very interesting idea. I will also look into the feasibility of
doing this in the shrink_folio_list()->swap_writepage()->zswap_store()
path.
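
To make the "charge delta" idea concrete, below is a minimal sketch
comparing the peak charge of the two accounting schemes for a single
folio store. This is a simulation under stated assumptions, not how the
kernel implements either scheme: the per-page granularity of the delta,
the 4 KiB page size, the compressed sizes, and both function names are
hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size


def store_double_charge(compressed_sizes):
    """Charge each compressed object first, uncharge the folio at the end.

    Returns (peak_charge, final_charge) in bytes.
    """
    folio_bytes = len(compressed_sizes) * PAGE_SIZE
    charge = peak = folio_bytes
    for size in compressed_sizes:
        charge += size            # compressed object charged immediately
        peak = max(peak, charge)
    charge -= folio_bytes         # saving only arrives once the folio is freed
    return peak, charge


def store_charge_delta(compressed_sizes):
    """Hypothetical scheme: uncharge (page size - object size) per page."""
    charge = peak = len(compressed_sizes) * PAGE_SIZE
    for size in compressed_sizes:
        charge -= PAGE_SIZE - size
        peak = max(peak, charge)
    return peak, charge


if __name__ == "__main__":
    sizes = [1024] * 16  # made-up: 64K folio, each page compresses to 1 KiB
    print(store_double_charge(sizes))
    print(store_charge_delta(sizes))
```

In this model both schemes end at the same final charge, but only the
double-charge variant ever rises above the folio's original charge,
which is the temporary overage the delta approach would avoid.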

Thanks again for the discussion, really appreciate it.

Thanks,
Kanchana

> 
> >
> > Please do let me know if you have any other questions. Appreciate your
> > feedback and comments.
> >
> > Thanks,
> > Kanchana