Hi Nhat,

> -----Original Message-----
> From: Nhat Pham <nphamcs@xxxxxxxxx>
> Sent: Tuesday, August 20, 2024 2:14 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: Huang, Ying <ying.huang@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; hannes@xxxxxxxxxxx; yosryahmed@xxxxxxxxxx;
> ryan.roberts@xxxxxxx; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> Zou, Nanhai <nanhai.zou@xxxxxxxxx>; Feghali, Wajdi K
> <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v4 0/4] mm: ZSWAP swap-out of mTHP folios
>
> On Mon, Aug 19, 2024 at 11:01 PM Sridhar, Kanchana P
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > Hi Ying,
> >
> > I confirmed that in the case of usemem, all calls to [1] occur from the
> > code path in [3].
> > However, my takeaway from this is that the more reclaim that results in
> > zswap_store(), for e.g., from mTHP folios, there is higher likelihood of
> > overage recorded per-process in current->memcg_nr_pages_over_high, which
> > could potentially be causing each process to reclaim memory, even if it
> > is possible that the swapout from a few of the 70 processes could have
> > brought the parent cgroup under the limit.
>
> Yeah IIUC, the memory increase from zswap store happens
> immediately/synchronously (swap_writepage() -> zswap_store() ->
> obj_cgroup_charge_zswap()), before the memory saving kicks in. This is
> a non-issue for swap - the memory saving doesn't happen right away,
> but it also doesn't increase memory usage (well, as you pointed out,
> obj_cgroup_charge_zswap() doesn't even happen).
>
> And yes, this is compounded a) if you're in a high concurrency regime,
> where all tasks in the same cgroup, under memory pressure, all go into
> reclaim. and b) for larger folios, where we compress multiple pages
> before the saving happens. I wonder how bad the effect is tho - could
> you quantify the reclamation amount that happens per zswap store
> somehow with tracing magic?

Thanks very much for the detailed comments and explanations! Sure, I will
gather data on the reclamation amount that happens per zswap store and share.

>
> Also, I wonder if there is a "charge delta" mechanism, where we
> directly uncharge by (page size - zswap object size), to avoid the
> temporary double charging... Sort of like what folio migration is
> doing now v.s what it used to do. Seems complicated - not even sure if
> it's possible TBH.

Yes, this is a very interesting idea. I will also look into the feasibility
of doing this in the
shrink_folio_list()->swap_writepage()->zswap_store() path.

Thanks again for the discussion, really appreciate it.

Thanks,
Kanchana

> >
> > Please do let me know if you have any other questions. Appreciate your
> > feedback and comments.
> >
> > Thanks,
> > Kanchana
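
For illustration only, here is a toy Python model (not kernel code) of the
accounting discussed in this thread. It compares the peak memcg charge when
each compressed object is charged before the folio's pages are uncharged,
against a "charge delta" scheme that uncharges (page size - zswap object size)
as each page is stored. The function names, the 4 KiB page size, and the
compressed sizes are all assumptions made for the sketch; the real charge and
uncharge paths are considerably more involved.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KiB base pages


def store_then_uncharge(compressed_sizes, start_charge):
    """Charge every compressed object first, then uncharge the folio's pages.

    Models the temporary double charging: the saving only shows up after
    all pages of the folio have been compressed.
    Returns (peak_charge, final_charge) in bytes.
    """
    charge = start_charge
    peak = charge
    for size in compressed_sizes:
        charge += size              # compressed object is charged immediately
        peak = max(peak, charge)
    charge -= len(compressed_sizes) * PAGE_SIZE  # saving arrives at the end
    return peak, charge


def charge_delta(compressed_sizes, start_charge):
    """Hypothetical "charge delta": uncharge (PAGE_SIZE - size) per page stored.

    Returns (peak_charge, final_charge) in bytes.
    """
    charge = start_charge
    peak = charge
    for size in compressed_sizes:
        charge -= PAGE_SIZE - size  # net effect of replacing page with object
        peak = max(peak, charge)
    return peak, charge


if __name__ == "__main__":
    # A 64 KiB folio (16 pages), each page assumed to compress to 1 KiB.
    sizes = [1024] * 16
    print(store_then_uncharge(sizes, 1_000_000))
    print(charge_delta(sizes, 1_000_000))
```

In this model both schemes end at the same final charge, but only the first
one ever rises above the starting charge, and the overshoot grows with the
number of pages in the folio. That overshoot is the quantity that could push a
task over the limit and into further reclaim.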