Re: [RFC PATCH v1 09/13] mm: zswap: Config variable to enable compress batching in zswap_store().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 22, 2024 at 7:17 PM Sridhar, Kanchana P
<kanchana.p.sridhar@xxxxxxxxx> wrote:
>
>
> > -----Original Message-----
> > From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > Sent: Tuesday, October 22, 2024 5:50 PM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > hannes@xxxxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx;
> > usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying
> > <ying.huang@xxxxxxxxx>; 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> > linux-crypto@xxxxxxxxxxxxxxx; herbert@xxxxxxxxxxxxxxxxxxx;
> > davem@xxxxxxxxxxxxx; clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx;
> > ebiggers@xxxxxxxxxx; surenb@xxxxxxxxxx; Accardi, Kristen C
> > <kristen.c.accardi@xxxxxxxxx>; zanussi@xxxxxxxxxx; viro@xxxxxxxxxxxxxxxxxx;
> > brauner@xxxxxxxxxx; jack@xxxxxxx; mcgrof@xxxxxxxxxx; kees@xxxxxxxxxx;
> > joel.granados@xxxxxxxxxx; bfoster@xxxxxxxxxx; willy@xxxxxxxxxxxxx; linux-
> > fsdevel@xxxxxxxxxxxxxxx; Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal,
> > Vinodh <vinodh.gopal@xxxxxxxxx>
> > Subject: Re: [RFC PATCH v1 09/13] mm: zswap: Config variable to enable
> > compress batching in zswap_store().
> >
> > On Thu, Oct 17, 2024 at 11:41 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@xxxxxxxxx> wrote:
> > >
> > > Add a new zswap config variable that controls whether zswap_store() will
> > > compress a batch of pages, for instance, the pages in a large folio:
> > >
> > >   CONFIG_ZSWAP_STORE_BATCHING_ENABLED
> > >
> > > The existing CONFIG_CRYPTO_DEV_IAA_CRYPTO variable added in commit
> > > ea7a5cbb4369 ("crypto: iaa - Add Intel IAA Compression Accelerator crypto
> > > driver core") is used to detect if the system has the Intel Analytics
> > > Accelerator (IAA), and the iaa_crypto module is available. If so, the
> > > kernel build will prompt for CONFIG_ZSWAP_STORE_BATCHING_ENABLED.
> > Hence,
> > > users have the ability to set
> > CONFIG_ZSWAP_STORE_BATCHING_ENABLED="y" only
> > > on systems that have Intel IAA.
> > >
> > > If CONFIG_ZSWAP_STORE_BATCHING_ENABLED is enabled, and IAA is
> > configured
> > > as the zswap compressor, zswap_store() will process the pages in a large
> > > folio in batches, i.e., multiple pages at a time. Pages in a batch will be
> > > compressed in parallel in hardware, then stored. On systems without Intel
> > > IAA and/or if zswap uses software compressors, pages in the batch will be
> > > compressed sequentially and stored.
> > >
> > > The patch also implements a zswap API that returns the status of this
> > > config variable.
> >
> > If we are compressing a large folio and batching is an option, is not
> > batching ever the correct thing to do? Why is the config option
> > needed?
>
> Thanks Yosry, for the code review comments! This is a good point. The main
> consideration here was not to impact software compressors run on non-Intel
> platforms, and only incur the memory footprint cost of multiple
> acomp_req/buffers in "struct crypto_acomp_ctx" if there is IAA to reduce
> latency with parallel compressions.
>
> If the memory footprint cost if acceptable, there is no reason not to do
> batching, even if compressions are sequential. We could amortize cost
> of the cgroup charging/objcg/stats updates.

Hmm yeah based on the next patch it seems like we allocate 7 extra
buffers, each sized 2 * PAGE_SIZE, percpu. That's 56KB percpu (with 4K
page size), which is non-trivial.

Making it a config option seems to be inconvenient though. Users have
to sign up for the memory overhead if some of them won't use IAA
batching, or disable batching all together. I would assume this would
be especially annoying for distros, but also for anyone who wants to
experiment with IAA batching.

The first thing that comes to mind is making this a boot option. But I
think we can make it even more convenient and support enabling it at
runtime. We just need to allocate the additional buffers the first
time batching is enabled. This shouldn't be too complicated, we have
an array of buffers on each CPU but we only allocate the first one
initially (unless batching is enabled at boot). When batching is
enabled, we can allocate the remaining buffers.

The only shortcoming of this approach is that if we enable batching
then disable it, we can't free the buffers without significant
complexity, but I think that should be fine. I don't see this being a
common pattern.

WDYT?



>
> Thanks,
> Kanchana
>
> >
> > >
> > > Suggested-by: Ying Huang <ying.huang@xxxxxxxxx>
> > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@xxxxxxxxx>
> > > ---
> > >  include/linux/zswap.h |  6 ++++++
> > >  mm/Kconfig            | 12 ++++++++++++
> > >  mm/zswap.c            | 14 ++++++++++++++
> > >  3 files changed, 32 insertions(+)
> > >
> > > diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> > > index d961ead91bf1..74ad2a24b309 100644
> > > --- a/include/linux/zswap.h
> > > +++ b/include/linux/zswap.h
> > > @@ -24,6 +24,7 @@ struct zswap_lruvec_state {
> > >         atomic_long_t nr_disk_swapins;
> > >  };
> > >
> > > +bool zswap_store_batching_enabled(void);
> > >  unsigned long zswap_total_pages(void);
> > >  bool zswap_store(struct folio *folio);
> > >  bool zswap_load(struct folio *folio);
> > > @@ -39,6 +40,11 @@ bool zswap_never_enabled(void);
> > >
> > >  struct zswap_lruvec_state {};
> > >
> > > +static inline bool zswap_store_batching_enabled(void)
> > > +{
> > > +       return false;
> > > +}
> > > +
> > >  static inline bool zswap_store(struct folio *folio)
> > >  {
> > >         return false;
> > > diff --git a/mm/Kconfig b/mm/Kconfig
> > > index 33fa51d608dc..26d1a5cee471 100644
> > > --- a/mm/Kconfig
> > > +++ b/mm/Kconfig
> > > @@ -125,6 +125,18 @@ config ZSWAP_COMPRESSOR_DEFAULT
> > >         default "zstd" if ZSWAP_COMPRESSOR_DEFAULT_ZSTD
> > >         default ""
> > >
> > > +config ZSWAP_STORE_BATCHING_ENABLED
> > > +       bool "Batching of zswap stores with Intel IAA"
> > > +       depends on ZSWAP && CRYPTO_DEV_IAA_CRYPTO
> > > +       default n
> > > +       help
> > > +       Enables zswap_store to swapout large folios in batches of 8 pages,
> > > +       rather than a page at a time, if the system has Intel IAA for hardware
> > > +       acceleration of compressions. If IAA is configured as the zswap
> > > +       compressor, this will parallelize batch compression of upto 8 pages
> > > +       in the folio in hardware, thereby improving large folio compression
> > > +       throughput and reducing swapout latency.
> > > +
> > >  choice
> > >         prompt "Default allocator"
> > >         depends on ZSWAP
> > > diff --git a/mm/zswap.c b/mm/zswap.c
> > > index 948c9745ee57..4893302d8c34 100644
> > > --- a/mm/zswap.c
> > > +++ b/mm/zswap.c
> > > @@ -127,6 +127,15 @@ static bool zswap_shrinker_enabled =
> > IS_ENABLED(
> > >                 CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
> > >  module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool,
> > 0644);
> > >
> > > +/*
> > > + * Enable/disable batching of compressions if zswap_store is called with a
> > > + * large folio. If enabled, and if IAA is the zswap compressor, pages are
> > > + * compressed in parallel in batches of say, 8 pages.
> > > + * If not, every page is compressed sequentially.
> > > + */
> > > +static bool __zswap_store_batching_enabled = IS_ENABLED(
> > > +       CONFIG_ZSWAP_STORE_BATCHING_ENABLED);
> > > +
> > >  bool zswap_is_enabled(void)
> > >  {
> > >         return zswap_enabled;
> > > @@ -241,6 +250,11 @@ static inline struct xarray
> > *swap_zswap_tree(swp_entry_t swp)
> > >         pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,         \
> > >                  zpool_get_type((p)->zpool))
> > >
> > > +__always_inline bool zswap_store_batching_enabled(void)
> > > +{
> > > +       return __zswap_store_batching_enabled;
> > > +}
> > > +
> > >  /*********************************
> > >  * pool functions
> > >  **********************************/
> > > --
> > > 2.27.0
> > >





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux