On Mon, Aug 19, 2024 at 11:01 PM Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx> wrote:
>
> Hi Ying,
>
> I confirmed that in the case of usemem, all calls to [1] occur from the code path in [3].
> However, my takeaway from this is that the more reclaim that results in zswap_store(),
> for e.g., from mTHP folios, there is higher likelihood of overage recorded per-process in
> current->memcg_nr_pages_over_high, which could potentially be causing each
> process to reclaim memory, even if it is possible that the swapout from a few of
> the 70 processes could have brought the parent cgroup under the limit.

Yeah IIUC, the memory increase from a zswap store happens
immediately/synchronously (swap_writepage() -> zswap_store() ->
obj_cgroup_charge_zswap()), before the memory saving kicks in.

This is a non-issue for ordinary swap - the memory saving doesn't happen
right away there either, but it also doesn't increase memory usage (and,
as you pointed out, obj_cgroup_charge_zswap() doesn't even happen).

And yes, this is compounded a) in a high-concurrency regime, where all
the tasks in the same cgroup go into reclaim under memory pressure, and
b) for larger folios, where we compress multiple pages before the saving
happens.

I wonder how bad the effect is tho - could you quantify the amount of
reclamation that happens per zswap store somehow with tracing magic? I
put a very rough sketch of what I have in mind at the bottom of this
mail.

Also, I wonder if there is a "charge delta" mechanism, where we directly
uncharge by (page size - zswap object size), to avoid the temporary
double charging... Sort of like what folio migration is doing now vs.
what it used to do. Seems complicated - not even sure if it's possible
TBH.

> Please do let me know if you have any other questions. Appreciate your feedback
> and comments.
>
> Thanks,
> Kanchana
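
Very rough sketch of the kind of instrumentation I was alluding to
above - completely untested, and the hook placement and format string
are just placeholders rather than an actual patch. The idea is to log,
on every zswap store, how many pages the folio holds and how much
overage the current task has already accumulated in
memcg_nr_pages_over_high, so the per-store reclaim can be correlated
from the trace buffer afterwards:

bool zswap_store(struct folio *folio)
{
	/* ... existing zswap_store() body ... */

	/*
	 * Log the folio size and the task's accumulated overage at the
	 * time of the store (memcg_nr_pages_over_high needs CONFIG_MEMCG).
	 * Read the results back via /sys/kernel/tracing/trace.
	 */
	trace_printk("zswap_store: nr_pages=%ld over_high=%u\n",
		     folio_nr_pages(folio),
		     current->memcg_nr_pages_over_high);

	/* ... */
}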