RE: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Sent: Tuesday, September 24, 2024 2:34 PM
> To: Nhat Pham <nphamcs@xxxxxxxxx>
> Cc: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>; linux-
> kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; hannes@xxxxxxxxxxx;
> chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx;
> shakeel.butt@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying
> <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> Zou, Nanhai <nanhai.zou@xxxxxxxxx>; Feghali, Wajdi K
> <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in
> zswap_store().
> 
> On Tue, Sep 24, 2024 at 2:08 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> >
> > On Tue, Sep 24, 2024 at 1:51 PM Sridhar, Kanchana P
> > <kanchana.p.sridhar@xxxxxxxxx> wrote:
> > >
> > >
> > > This is an excellent point. Thanks Nhat for catching this! I can see two
> > > options to solving this:
> > >
> > > Option 1: If zswap_mthp_enabled is "false", delete all stored offsets
> > > for the mTHP in zswap before exiting. This could race with writeback
> > > (either one or more subpages could be written back before zswap_store
> > > acquires the tree lock), however, I don't think it will cause data
> inconsistencies.
> > > Any offsets for subpages not written back will be deleted from zswap,
> > > zswap_store() will return false, and the backing swap device's subsequent
> > > swapout will over-write the zswap write-back data. Could anything go
> wrong
> > > with this?
> >
> > I think this should be safe, albeit a bit awkward.
> >
> > At this point (zswap_store()), we should have the folio added to to
> > swap cache, and locked. All the associated swap entries will point to
> > this same (large) folio.
> >
> > Any concurrent zswap writeback attempt, even on a tail page, should
> > get that folio when it calls __read_swap_cache_async(), and with
> > page_allocated == false, and should short circuit.
> >
> > So I don't think we will race with zswap_writeback().
> >
> > Yosry, Chengming, Johannes, any thoughts?
> 
> Why can't we just handle it the same way as we handle zswap
> disablement? If it is disabled, we invalidate any old entries for the
> offsets and return false to swapout to disk.
> 
> Taking a step back, why do we need the runtime knob and config option?
> Are there cases where we think zswapout of mTHPs will perform badly,
> or is it just due to lack of confidence in the feature?

Thanks Nhat and Yosry for the suggestions/comments.

If I recall correctly, the topic of adding a config option/knob came up
based on earlier data I had collected with a zram backing device setup,
which showed a performance degradation with zstd, but not with deflate-iaa.

Since the v7 data collected with an 823G SSD swap disk partition indicates
that we get good throughput and latency improvements with zswap-mTHP
with zstd and deflate-iaa, I am not sure if the knob is still required (if this
is representative of most of the setups that use mTHP).

I am confident about the zswap-mTHP feature itself, and don’t think the
knob is needed from that perspective. I think the question is really about
having the ability to disable zswap-mTHP in some existing setup where
having mTHP enabled performs worse with this patchset than without.

I am Ok with having the knob and handling it using Option 1, or, not
having a knob.

Thanks,
Kanchana 

> 
> >
> > >
> > > Option 2: Only provide a build config option,
> > > CONFIG_ZSWAP_STORE_THP_DEFAULT_ON, that cannot be dynamically
> changed.
> >
> > This can be a last resort thing, if the above doesn't work. Not the
> > end of the world, but not ideal :)
> >
> > >
> > > Would appreciate suggestions on these, and other potential solutions.
> > >
> > > Thanks,
> > > Kanchana




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux