> -----Original Message-----
> From: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> Sent: Tuesday, January 28, 2025 10:40 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>; Yosry Ahmed <yosryahmed@xxxxxxxxxx>;
> Nhat Pham <nphamcs@xxxxxxxxx>; Chengming Zhou <chengming.zhou@xxxxxxxxx>;
> Andrew Morton <akpm@linux-foundation.org>; linux-mm@xxxxxxxxx;
> stable@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 mm-hotfixes] mm/zswap: fix inconsistent charging when zswap_store_page() fails
>
> On Wed, Jan 29, 2025 at 4:09 AM Sridhar, Kanchana P
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > Hi Hyeonggon,
> >
> > > -----Original Message-----
> > > From: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > > Sent: Tuesday, January 28, 2025 10:55 AM
> > > To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>; Johannes Weiner
> > > <hannes@xxxxxxxxxxx>; Yosry Ahmed <yosryahmed@xxxxxxxxxx>;
> > > Nhat Pham <nphamcs@xxxxxxxxx>; Chengming Zhou <chengming.zhou@xxxxxxxxx>;
> > > Andrew Morton <akpm@linux-foundation.org>
> > > Cc: linux-mm@xxxxxxxxx; Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>;
> > > stable@xxxxxxxxxxxxxxx
> > > Subject: [PATCH v2 mm-hotfixes] mm/zswap: fix inconsistent charging when zswap_store_page() fails
> > >
> > > Commit b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()")
> > > skips charging any zswapped base pages when it failed to zswap the entire
> > > folio.
> > >
> > > However, when some base pages are zswapped but it failed to zswap
> > > the entire folio, the zswap operation is rolled back.
> > > When freeing zswap entries for those pages, zswap_entry_free() uncharges
> > > the pages that were not previously charged, causing zswap charging to
> > > become inconsistent.
> > >
> > > This inconsistency triggers two warnings with the following steps:
> > >   # On a machine with 64GiB of RAM and 36GiB of zswap
> > >   $ stress-ng --bigheap 2 # wait until the OOM-killer kills stress-ng
> > >   $ sudo reboot
> > >
> > > The two warnings are:
> > >   in mm/memcontrol.c:163, function obj_cgroup_release():
> > >     WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1));
> > >
> > >   in mm/page_counter.c:60, function page_counter_cancel():
> > >     if (WARN_ONCE(new < 0, "page_counter underflow: %ld nr_pages=%lu\n",
> > >                   new, nr_pages))
> > >
> > > While objcg events should only be accounted for when the entire folio is
> > > zswapped, objcg charging should be performed regardless.
> > > Fix accordingly.
> > >
> > > After resolving the inconsistency, these warnings disappear.
> > >
> > > Fixes: b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > > ---
> > >
> > > v1->v2:
> > >
> > > Fixed objcg events being accounted for on zswap failure.
> > >
> > > Fixed the incorrect description. I misunderstood that the base pages are
> > > going to be stored in zswap, but their zswap entries are freed immediately.
> > >
> > > Added a comment on why it charges pages that are going to be removed
> > > from zswap.
> > >
> > >  mm/zswap.c | 14 ++++++++++----
> > >  1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/mm/zswap.c b/mm/zswap.c
> > > index 6504174fbc6a..10b30ac46deb 100644
> > > --- a/mm/zswap.c
> > > +++ b/mm/zswap.c
> > > @@ -1568,20 +1568,26 @@ bool zswap_store(struct folio *folio)
> > >
> > >                 bytes = zswap_store_page(page, objcg, pool);
> > >                 if (bytes < 0)
> > > -                       goto put_pool;
> > > +                       goto charge_zswap;
> > >                 compressed_bytes += bytes;
> > >         }
> > >
> > > -       if (objcg) {
> > > -               obj_cgroup_charge_zswap(objcg, compressed_bytes);
> > > +       if (objcg)
> > >                 count_objcg_events(objcg, ZSWPOUT, nr_pages);
> > > -       }
> > >
> > >         atomic_long_add(nr_pages, &zswap_stored_pages);
> > >         count_vm_events(ZSWPOUT, nr_pages);
> > >
> > >         ret = true;
> > >
> > > +charge_zswap:
> > > +       /*
> > > +        * Charge zswapped pages even when it failed to zswap the entire folio,
> > > +        * because zswap_entry_free() will uncharge them anyway.
> > > +        * Otherwise zswap charging will become inconsistent.
> > > +        */
> > > +       if (objcg)
> > > +               obj_cgroup_charge_zswap(objcg, compressed_bytes);
> >
> > Thanks for finding this bug! I am thinking it might make sense to charge
> > and increment the zswap_stored_pages counter in zswap_store_page().
> > Something like:
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index b84c20d889b1..fd2a72598a8a 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1504,11 +1504,14 @@ static ssize_t zswap_store_page(struct page *page,
> >         entry->pool = pool;
> >         entry->swpentry = page_swpentry;
> >         entry->objcg = objcg;
> > +       if (objcg)
> > +               obj_cgroup_charge_zswap(objcg, entry->length);
> >         entry->referenced = true;
> >         if (entry->length) {
> >                 INIT_LIST_HEAD(&entry->lru);
> >                 zswap_lru_add(&zswap_list_lru, entry);
> >         }
> > +       atomic_long_inc(&zswap_stored_pages);
> >
> >         return entry->length;
> >
> > @@ -1526,7 +1529,6 @@ bool zswap_store(struct folio *folio)
> >         struct obj_cgroup *objcg = NULL;
> >         struct mem_cgroup *memcg = NULL;
> >         struct zswap_pool *pool;
> > -       size_t compressed_bytes = 0;
> >         bool ret = false;
> >         long index;
> >
> > @@ -1569,15 +1571,11 @@ bool zswap_store(struct folio *folio)
> >                 bytes = zswap_store_page(page, objcg, pool);
> >                 if (bytes < 0)
> >                         goto put_pool;
> > -               compressed_bytes += bytes;
> >         }
> >
> > -       if (objcg) {
> > -               obj_cgroup_charge_zswap(objcg, compressed_bytes);
> > +       if (objcg)
> >                 count_objcg_events(objcg, ZSWPOUT, nr_pages);
> > -       }
> >
> > -       atomic_long_add(nr_pages, &zswap_stored_pages);
> >         count_vm_events(ZSWPOUT, nr_pages);
> >
> >         ret = true;
>
> Hi Sridhar,
>
> It looks much clearer!
> And we can optimize if it turns out to be worth the complexity.
>
> May I ask your permission to add your Signed-off-by: and Co-developed-by: ?

Yes, please go ahead Hyeonggon. Thanks!
Kanchana

> I'm afraid to use this without your confirmation due to the
> Developer's Certificate of Origin.
>
> Best,
> Hyeonggon
>
> > > put_pool:
> > >         zswap_pool_put(pool);
> > > put_objcg:
> > > --
> > > 2.47.1
> >
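
For readers outside the thread, below is a minimal stand-alone C model of the accounting imbalance discussed above. It is not kernel code: the "charged" counter and the function names are hypothetical stand-ins for the objcg zswap charge and for the uncharge done by zswap_entry_free(). It shows why charging only on full-folio success, while uncharging every entry on rollback, drives the counter negative (the page_counter underflow warning), and why per-page charging as suggested for zswap_store_page() keeps it balanced.

/*
 * Illustrative sketch only, not kernel code. Names and values are
 * hypothetical; "charged" stands in for the objcg zswap counter.
 */
#include <stdio.h>

#define FOLIO_PAGES     4
#define ENTRY_BYTES     1024    /* pretend every page compresses to 1 KiB */

static long charged;            /* models the per-cgroup zswap charge */

static void charge(long bytes)   { charged += bytes; }
static void uncharge(long bytes) { charged -= bytes; }

/* Pre-fix behaviour: bulk charge only after the whole folio succeeded. */
static void charge_on_full_success(int stored_ok)
{
        long compressed = 0;

        for (int i = 0; i < stored_ok; i++)
                compressed += ENTRY_BYTES;

        if (stored_ok == FOLIO_PAGES)
                charge(compressed);

        /* Rollback frees every stored entry and uncharges each one. */
        if (stored_ok != FOLIO_PAGES)
                for (int i = 0; i < stored_ok; i++)
                        uncharge(ENTRY_BYTES);
}

/* Suggested behaviour: charge each page as it is stored. */
static void charge_per_page(int stored_ok)
{
        for (int i = 0; i < stored_ok; i++)
                charge(ENTRY_BYTES);

        if (stored_ok != FOLIO_PAGES)
                for (int i = 0; i < stored_ok; i++)
                        uncharge(ENTRY_BYTES);
}

int main(void)
{
        charged = 0;
        charge_on_full_success(3);      /* 3 of 4 pages stored, then rollback */
        printf("bulk charge, partial store: charged = %ld\n", charged);

        charged = 0;
        charge_per_page(3);
        printf("per-page charge, partial store: charged = %ld\n", charged);

        return 0;
}

Compiled with any C99 compiler, the first call reports a negative balance (the inconsistency the patch fixes) and the second reports zero, since each uncharge is now paired with an earlier charge.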