> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Sent: Wednesday, September 25, 2024 2:06 PM
> To: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; nphamcs@xxxxxxxxx;
> chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx; shakeel.butt@xxxxxxxxx;
> ryan.roberts@xxxxxxx; Huang, Ying <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx;
> akpm@xxxxxxxxxxxxxxxxxxxx; Zou, Nanhai <nanhai.zou@xxxxxxxxx>;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().
>
> On Wed, Sep 25, 2024 at 1:13 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Wed, Sep 25, 2024 at 12:39:02PM -0700, Yosry Ahmed wrote:
> > > On Wed, Sep 25, 2024 at 12:20 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Sep 25, 2024 at 11:30:34AM -0700, Yosry Ahmed wrote:
> > > > > Johannes wrote:
> > > > > > If this ever becomes an issue, we can handle it in a fastpath-slowpath
> > > > > > scheme: check the limit up front for fast-path failure if we're
> > > > > > already maxed out, just like now; then make obj_cgroup_charge_zswap()
> > > > > > atomically charge against zswap.max and unwind the store if we raced.
> > > > > >
> > > > > > For now, I would just keep the simple version we currently have: check
> > > > > > once in zswap_store() and then just go ahead for the whole folio.
> > > > >
> > > > > I am not totally against this but I feel like this is too optimistic.
> > > > > I think we can keep it simple-ish by maintaining an ewma for the
> > > > > compression ratio, we already have primitives for this (see
> > > > > DECLARE_EWMA).
> > > > >
> > > > > Then in zswap_store(), we can use the ewma to estimate the compressed
> > > > > size and use it to do the memcg and global limit checks once, like we
> > > > > do today. Instead of just checking if we are below the limits, we
> > > > > check if we have enough headroom for the estimated compressed size.
> > > > > Then we call zswap_store_page() to do the per-page stuff, then do
> > > > > batched charging and stats updates.
> > > >
> > > > I'm not sure what you gain from making a non-atomic check precise. You
> > > > can get a hundred threads determining down precisely that *their*
> > > > store will fit exactly into the last 800kB before the limit.
> > >
> > > We just get to avoid overshooting in cases where we know we probably
> > > can't fit it anyway. If we have 4KB left and we are trying to compress
> > > a 2MB THP, for example. It just makes the upfront check to avoid
> > > pointless compression a little bit more meaningful.
> >
> > I think I'm missing something. It's not just an upfront check, it's
> > the only check. The charge down the line doesn't limit anything, it
> > just counts. So if this check passes, we WILL store the folio. There
> > is no pointless compression.
>
> I got confused by what you said about the fast-slow path, I thought
> you were suggesting we do this now, so I was saying it's better to use
> an estimate of the compressed size in the fast path to avoid pointless
> compression.
>
> I missed the second paragraph.
>
> > We might overshoot the limit by about one folio in a single-threaded
> > scenario. But that is negligible in comparison to the overshoot we can
> > get due to race conditions.
> >
> > Again, I see no practical, meaningful difference in outcome by
> > making that limit check any more precise. Just keep it as-is.
> >
> > Sorry to be blunt, but "precision" in a non-atomic check like this
> > makes no sense. The fact that it's not too expensive is irrelevant.
> > This discussion around this honestly has gone off the rails.
>
> Yeah I thought we were talking about the version where we rollback
> compressions if we overshoot, my bad. We discussed quite a few things
> and I managed to confuse myself.
>
> > Just leave the limit checks exactly as they are. Check limits and
> > cgroup_may_zswap() once up front. Compress the subpages. Acquire
> > references and bump all stats in batches of folio_nr_pages(). You can
> > add up the subpage compressed bytes in the for-loop and do the
> > obj_cgroup_charge_zswap() in a single call at the end as well.
>
> We can keep the limit checks as they are for now, and revisit as needed.

Thanks Johannes and Yosry for the discussion! I will proceed as suggested.
For reference, I have appended rough sketches of the EWMA estimate idea and
of the batched zswap_store() flow discussed above.

Thanks,
Kanchana
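For reference only, here is a minimal sketch of the DECLARE_EWMA-based estimate
Yosry describes above. It is not part of this series, and everything beyond
DECLARE_EWMA itself (the helper names, the EWMA parameters, where the sample is
taken) is a placeholder assumption, not an agreed design:

/* Sketch only: track per-page compressed size with <linux/average.h>. */
#include <linux/average.h>
#include <linux/mm.h>

/* 4 fractional bits of precision, new samples weighted 1/8. */
DECLARE_EWMA(zswap_comp_bytes, 4, 8);

static struct ewma_zswap_comp_bytes zswap_comp_bytes_avg;

/* Feed one sample after each successful per-page compression. */
static void zswap_note_compressed_size(unsigned int comp_len)
{
	ewma_zswap_comp_bytes_add(&zswap_comp_bytes_avg, comp_len);
}

/* Estimated compressed footprint of a folio, for the headroom check. */
static unsigned long zswap_estimate_folio_bytes(struct folio *folio)
{
	unsigned long per_page = ewma_zswap_comp_bytes_read(&zswap_comp_bytes_avg);

	/* No samples yet: conservatively assume no compression benefit. */
	if (!per_page)
		per_page = PAGE_SIZE;

	return per_page * folio_nr_pages(folio);
}

The idea would be for zswap_store() to compare this estimate against the
remaining global and memcg headroom instead of a plain below-the-limit check.
As agreed above, this is not being done for now.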
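And a rough sketch of the zswap_store() shape Johannes suggests: one up-front
cgroup/limit check, a per-page store loop, then batched charging and stats.
This is not the actual patch; zswap_store_page()'s signature, the
zswap_check_limits() helper, and the error handling are simplified assumptions
here:

bool zswap_store(struct folio *folio)
{
	long nr_pages = folio_nr_pages(folio);
	size_t compressed_bytes = 0;
	struct obj_cgroup *objcg;
	long i;

	objcg = get_obj_cgroup_from_folio(folio);

	/* Check the memcg and global limits once for the whole folio. */
	if (objcg && !obj_cgroup_may_zswap(objcg))
		goto put_objcg;
	if (zswap_check_limits())
		goto put_objcg;

	for (i = 0; i < nr_pages; i++) {
		/* Assumed to compress and store one subpage, returning its size. */
		ssize_t ret = zswap_store_page(folio, i, objcg);

		if (ret < 0)
			goto put_objcg; /* real code must unwind pages stored so far */
		compressed_bytes += ret;
	}

	/* Batched charging and stats for the whole folio. */
	if (objcg)
		obj_cgroup_charge_zswap(objcg, compressed_bytes);
	count_vm_events(ZSWPOUT, nr_pages);

	if (objcg)
		obj_cgroup_put(objcg);
	return true;

put_objcg:
	if (objcg)
		obj_cgroup_put(objcg);
	return false;
}

This can overshoot the limits by roughly one folio per concurrent store, which,
as noted above, is negligible compared to what racing threads can already cause.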