Hi Nhat,

> -----Original Message-----
> From: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Sent: Monday, December 2, 2024 4:31 PM
> To: Nhat Pham <nphamcs@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> hannes@xxxxxxxxxxx; yosryahmed@xxxxxxxxxx;
> chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx;
> ryan.roberts@xxxxxxx; ying.huang@xxxxxxxxx; 21cnbao@xxxxxxxxx;
> akpm@xxxxxxxxxxxxxxxxxxxx; linux-crypto@xxxxxxxxxxxxxxx;
> herbert@xxxxxxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx; ebiggers@xxxxxxxxxx;
> surenb@xxxxxxxxxx; Accardi, Kristen C <kristen.c.accardi@xxxxxxxxx>;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh
> <vinodh.gopal@xxxxxxxxx>; Sridhar, Kanchana P
> <kanchana.p.sridhar@xxxxxxxxx>
> Subject: RE: [PATCH v4 09/10] mm: zswap: Allocate pool batching resources if
> the crypto_alg supports batching.
>
> Hi Nhat,
>
> > -----Original Message-----
> > From: Nhat Pham <nphamcs@xxxxxxxxx>
> > Sent: Monday, December 2, 2024 11:16 AM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > hannes@xxxxxxxxxxx; yosryahmed@xxxxxxxxxx;
> > chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx;
> > ryan.roberts@xxxxxxx; ying.huang@xxxxxxxxx; 21cnbao@xxxxxxxxx;
> > akpm@xxxxxxxxxxxxxxxxxxxx; linux-crypto@xxxxxxxxxxxxxxx;
> > herbert@xxxxxxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> > clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx; ebiggers@xxxxxxxxxx;
> > surenb@xxxxxxxxxx; Accardi, Kristen C <kristen.c.accardi@xxxxxxxxx>;
> > Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh
> > <vinodh.gopal@xxxxxxxxx>
> > Subject: Re: [PATCH v4 09/10] mm: zswap: Allocate pool batching resources if
> > the crypto_alg supports batching.
> >
> > On Fri, Nov 22, 2024 at 11:01 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@xxxxxxxxx> wrote:
> > >
> > > This patch does the following:
> > >
> > > 1) Modifies the definition of "struct crypto_acomp_ctx" to represent a
> > >    configurable number of acomp_reqs and buffers. Adds a "nr_reqs" to
> > >    "struct crypto_acomp_ctx" to contain the nr of resources that will be
> > >    allocated in the cpu onlining code.
> > >
> > > 2) The zswap_cpu_comp_prepare() cpu onlining code will detect if the
> > >    crypto_acomp created for the pool (in other words, the zswap compression
> > >    algorithm) has registered an implementation for batch_compress() and
> > >    batch_decompress(). If so, it will set "nr_reqs" to
> > >    SWAP_CRYPTO_BATCH_SIZE and allocate these many reqs/buffers, and set
> > >    the acomp_ctx->nr_reqs accordingly. If the crypto_acomp does not support
> > >    batching, "nr_reqs" defaults to 1.
> > >
> > > 3) Adds a "bool can_batch" to "struct zswap_pool" that step (2) will set to
> > >    true if the batching API are present for the crypto_acomp.
> >
> > Why do we need this "can_batch" field? IIUC, this can be determined
> > from the compressor internal fields itself, no?
> >
> > acomp_has_async_batching(acomp);
> >
> > Is this just for convenience, or is this actually an expensive thing to
> > compute?
>
> Thanks for your comments. This is a good question. I tried not to imply that
> batching resources have been allocated for the cpu based only on what
> acomp_has_async_batching() returns. It is possible that the cpu onlining
> code ran into an -ENOMEM error on any particular cpu. In this case, I set
> the pool->can_batch to "false", mainly for convenience, so that zswap
> can be somewhat insulated from migration.
> I agree that this may not be
> the best solution; and whether or not batching is enabled can be directly
> determined just before the call to crypto_acomp_batch_compress()
> based on:
>
> acomp_ctx->nr_reqs == SWAP_CRYPTO_BATCH_SIZE;
>
> I currently have a BUG_ON() for this condition not being met, that relies
> on the pool->can_batch gating the flow to get to zswap_batch_compress().
>
> I think a better solution would be to check for having SWAP_CRYPTO_BATCH_SIZE
> # of acomp_ctx resources right after we acquire the acomp_ctx->mutex and before
> the call to crypto_acomp_batch_compress(). If so, we proceed, and if not, we call
> crypto_acomp_compress(). It seems this might be the only way to know for sure
> whether the crypto batching API can be called, given that migration is possible
> at any point in zswap_store(). Once we have obtained the mutex_lock, it seems
> we can proceed with batching based on this check (although the UAF situation
> remains as a larger issue, beyond the scope of this patch). I would appreciate
> other ideas as well.
>
> Also, I have submitted a patch-series [1] with Yosry's & Johannes' suggestions
> to this series. This is setting up a consolidated zswap_store()/zswap_store_pages()
> code path for batching and non-batching compressors. My goal is for [1] to
> go through code reviews and be able to transition to batching, with a simple
> check:
>
> if (acomp_ctx->nr_reqs == SWAP_CRYPTO_BATCH_SIZE)
>         zswap_batch_compress();
> else
>         zswap_compress();
>
> Please feel free to provide code review comments in [1]. Thanks!
>
> [1]: https://patchwork.kernel.org/project/linux-mm/list/?series=912937
>
> >
> > >
> > > SWAP_CRYPTO_BATCH_SIZE is set to 8, which will be the IAA compress batching
> >
> > I like a sane default value as much as the next guy, but this seems a
> > bit odd to me:
> >
> > 1. The placement of this constant/default value seems strange to me.
> > This is a compressor-specific value no?
> > Why are we enforcing this
> > batching size at the zswap level, and uniformly at that? What if we
> > introduce a new batch compression algorithm...? Or am I missing
> > something, and this is a sane default for other compressors too?
>
> You bring up an excellent point. This is a compressor-specific value.
> Instead of setting this up as a constant, which as you correctly observe,
> may not make sense for a non-IAA compressor, one way to get
> this could be by querying the compressor, say:
>
> int acomp_get_max_batchsize(struct crypto_acomp *tfm) {...};
>
> to then allocate sufficient acomp_reqs/buffers/etc. in the zswap
> cpu onlining code.
>
> >
> > 2. Why is this value set to 8? Experimentation? Could you add some
> > justification in documentation?
>
> Can I get back to you later this week with a proposal for this? We plan
> to have a team discussion on how best to approach this for current
> and future hardware.

Sorry it took me quite a while to get back to you on this. I have been busy
with implementing request chaining, and other major improvements to this
series based on the comments received thus far.

I will be submitting a v5 of this series shortly, in which I have implemented
an IAA_CRYPTO_MAX_BATCH_SIZE in the iaa_crypto driver. For now I set this to 8
since we have done all our testing with a batch size of 8, but we are still
running experiments to figure this out, hence this #define in the iaa_crypto
driver (in v5) can potentially change. Further, there is a zswap-specific
ZSWAP_MAX_BATCH_SIZE in v5, which is also 8.

I would appreciate code review comments for v5. If the approach I've taken in
v5 is acceptable, I will add more details/justification in the documentation
in a v6.

Thanks,
Kanchana

>
> Thanks,
> Kanchana
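
For readers following the thread, the two ideas discussed above (querying the
compressor for its maximum batch size at cpu onlining time, and choosing the
batch or single-page path from the per-cpu nr_reqs) can be sketched in
user-space C. This is only an illustration under stated assumptions: every
mock_* name, MOCK_BUF_SIZE, and the "fall back to one request if the batch
allocation fails" policy are hypothetical stand-ins, not the kernel crypto or
zswap API and not the code in v4 or v5. In the kernel, the path check would
run after taking acomp_ctx->mutex.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_BUF_SIZE 8192u /* illustrative per-request buffer size */

/* Hypothetical stand-in for a compressor (not struct crypto_acomp). */
struct mock_acomp {
	unsigned int max_batch_size; /* 0 or 1 means no batching support */
};

/* Hypothetical stand-in for the per-cpu compression context. */
struct mock_acomp_ctx {
	unsigned int nr_reqs;    /* resources actually allocated on this cpu */
	unsigned char **buffers; /* one buffer per request */
};

enum mock_path { MOCK_PATH_SINGLE, MOCK_PATH_BATCH };

/* Analogue of the proposed acomp_get_max_batchsize() query. */
static unsigned int mock_get_max_batchsize(const struct mock_acomp *tfm)
{
	assert(tfm != NULL);
	return tfm->max_batch_size > 1 ? tfm->max_batch_size : 1;
}

/* Free every buffer recorded in nr_reqs, then reset the context. */
static void mock_ctx_release(struct mock_acomp_ctx *ctx)
{
	unsigned int i;

	if (ctx->buffers)
		for (i = 0; i < ctx->nr_reqs; i++)
			free(ctx->buffers[i]);
	free(ctx->buffers);
	ctx->buffers = NULL;
	ctx->nr_reqs = 0;
}

/* Allocate nr buffers; on failure, undo the partial allocation. */
static int mock_alloc(struct mock_acomp_ctx *ctx, unsigned int nr)
{
	unsigned int i;

	ctx->buffers = calloc(nr, sizeof(*ctx->buffers));
	if (!ctx->buffers)
		return -1;
	for (i = 0; i < nr; i++) {
		ctx->buffers[i] = malloc(MOCK_BUF_SIZE);
		if (!ctx->buffers[i]) {
			ctx->nr_reqs = i; /* only i buffers exist so far */
			mock_ctx_release(ctx);
			return -1;
		}
	}
	ctx->nr_reqs = nr;
	return 0;
}

/*
 * Analogue of the cpu onlining step. fail_batch_alloc simulates an
 * -ENOMEM on the batch allocation; the assumed policy is to fall back
 * to a single request rather than fail the cpu.
 */
static int mock_cpu_prepare(struct mock_acomp_ctx *ctx,
			    const struct mock_acomp *tfm,
			    int fail_batch_alloc)
{
	unsigned int want = mock_get_max_batchsize(tfm);

	ctx->nr_reqs = 0;
	ctx->buffers = NULL;
	if (want > 1 && !fail_batch_alloc && mock_alloc(ctx, want) == 0)
		return 0;
	return mock_alloc(ctx, 1);
}

/* Pick the path from what this cpu actually has, not from the algorithm. */
static enum mock_path mock_store_path(const struct mock_acomp_ctx *ctx,
				      const struct mock_acomp *tfm)
{
	unsigned int batch = mock_get_max_batchsize(tfm);

	if (batch > 1 && ctx->nr_reqs == batch)
		return MOCK_PATH_BATCH;
	return MOCK_PATH_SINGLE;
}
```

With a compressor reporting a batch size of 8, mock_cpu_prepare() followed by
mock_store_path() selects the batch path; if the batch allocation is made to
fail on one cpu, that cpu ends up with nr_reqs == 1 and takes the single-page
path, which is the situation the nr_reqs check is meant to cover without a
separate pool-wide can_batch flag.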