Hi Joel,

> -----Original Message-----
> From: Joel Granados <joel.granados@xxxxxxxxxx>
> Sent: Monday, October 28, 2024 7:42 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; hannes@xxxxxxxxxxx;
> yosryahmed@xxxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx;
> usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; Huang, Ying <ying.huang@xxxxxxxxx>;
> 21cnbao@xxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; linux-crypto@xxxxxxxxxxxxxxx;
> herbert@xxxxxxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx; clabbe@xxxxxxxxxxxx;
> ardb@xxxxxxxxxx; ebiggers@xxxxxxxxxx; surenb@xxxxxxxxxx;
> Accardi, Kristen C <kristen.c.accardi@xxxxxxxxx>; zanussi@xxxxxxxxxx;
> viro@xxxxxxxxxxxxxxxxxx; brauner@xxxxxxxxxx; jack@xxxxxxx; mcgrof@xxxxxxxxxx;
> kees@xxxxxxxxxx; bfoster@xxxxxxxxxx; willy@xxxxxxxxxxxxx;
> linux-fsdevel@xxxxxxxxxxxxxxx; Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>;
> Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [RFC PATCH v1 13/13] mm: vmscan, swap, zswap: Compress
> batching of folios in shrink_folio_list().
>
> On Thu, Oct 17, 2024 at 11:41:01PM -0700, Kanchana P Sridhar wrote:
> > This patch enables the use of Intel IAA hardware compression acceleration
> > to reclaim a batch of folios in shrink_folio_list(). This results in
> > reclaim throughput and workload/sys performance improvements.
> >
> > The earlier patches on compress batching deployed multiple IAA compress
> > engines to compress up to SWAP_CRYPTO_SUB_BATCH_SIZE pages within a
> > large folio being stored in zswap_store(). This patch extends the
> > efficiency gains demonstrated with IAA "batching within folios" to vmscan
> > "batching of folios", which also batches within folios via the extensible
> > architecture of the __zswap_store_batch_core() procedure added earlier,
> > which accepts an array of folios.
>
> ...
>
> > +static inline void zswap_store_batch(struct swap_in_memory_cache_cb *simc)
> > +{
> > +}
> > +
> >  static inline bool zswap_store(struct folio *folio)
> >  {
> >  	return false;
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 79e6cb1d5c48..b8d6b599e9ae 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -2064,6 +2064,15 @@ static struct ctl_table vm_table[] = {
> >  		.extra1		= SYSCTL_ZERO,
> >  		.extra2		= (void *)&page_cluster_max,
> >  	},
> > +	{
> > +		.procname	= "compress-batchsize",
> > +		.data		= &compress_batchsize,
> > +		.maxlen		= sizeof(int),
> > +		.mode		= 0644,
> > +		.proc_handler	= proc_dointvec_minmax,
>
> Why not use proc_douintvec_minmax? These are the reasons I think you
> should use that (please correct me if I misread your patch):
>
> 1. Your range is [1,32] -> so no negative values.
> 2. You are using the value to compare with an unsigned int
>    (simc->nr_folios) in your `struct swap_in_memory_cache_cb`. So
>    instead of going from int to uint, you should just do uint all
>    around. No?
> 3. Using proc_douintvec_minmax will automatically error out on negative
>    input without even considering your range, so there is less code
>    executed at the end.

Thanks for your code review comments! Sure, what you suggest makes sense.

Based on Yosry's suggestions, I plan to split the batching-reclaim
shrink_folio_list() changes into a separate series and focus the initial
series on just the zswap modifications needed for large folio compression
batching. I will make sure to incorporate your comments in the
shrink_folio_list() batching-reclaim series.
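
For reference, here is a rough sketch of how I would rework that sysctl
entry when it moves to the new series, assuming compress_batchsize and
compress_batchsize_max both become unsigned int. The [1,32] range and the
names are from this patch; the default value and the compress_batchsize_min
helper are hypothetical, and the declarations are only there to keep the
snippet self-contained:

	/* Hypothetical default; range [1,32] follows this patch. */
	static unsigned int compress_batchsize = 1;
	static const unsigned int compress_batchsize_min = 1;	/* hypothetical helper */
	static const unsigned int compress_batchsize_max = 32;

	{
		.procname	= "compress-batchsize",
		.data		= &compress_batchsize,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_douintvec_minmax,
		.extra1		= (void *)&compress_batchsize_min,
		.extra2		= (void *)&compress_batchsize_max,
	},

As you note, proc_douintvec_minmax rejects negative input outright, before
the [1,32] range is even consulted, and keeping the value unsigned end to
end also simplifies the comparison against simc->nr_folios.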
Thanks,
Kanchana

> > +		.extra1		= SYSCTL_ONE,
> > +		.extra2		= (void *)&compress_batchsize_max,
> > +	},
> >  	{
> >  		.procname	= "dirtytime_expire_seconds",
> >  		.data		= &dirtytime_expire_interval,
> > diff --git a/mm/page_io.c b/mm/page_io.c
> > index a28d28b6b3ce..065db25309b8 100644
> > --- a/mm/page_io.c
> > +++ b/mm/page_io.c
> > @@ -226,6 +226,131 @@ static void swap_zeromap_folio_clear(struct folio *folio)
> >  	}
> >  }
>
> ...
>
> Best
>
> --
> Joel Granados