On Mon, Nov 25, 2024 at 12:20:01PM -0800, Yosry Ahmed wrote:
> On Fri, Nov 22, 2024 at 11:01 PM Kanchana P Sridhar
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > This patch adds two new zswap APIs:
> >
> > 1) bool zswap_can_batch(void);
> > 2) void zswap_batch_store(struct folio_batch *batch, int *errors);
> >
> > Higher-level mm code, for instance swap_writepage(), can query whether
> > the current zswap pool supports batching by calling zswap_can_batch().
> > If so, it can invoke zswap_batch_store() instead of zswap_store() to
> > swap out a large folio to zswap much more efficiently.
> >
> > Hence, on systems with Intel IAA hardware compress/decompress
> > accelerators, swap_writepage() will invoke zswap_batch_store() for
> > large folios.
> >
> > zswap_batch_store() will call crypto_acomp_batch_compress() to compress
> > up to SWAP_CRYPTO_BATCH_SIZE (i.e. 8) pages of large folios in parallel
> > using the multiple compress engines available in IAA.
> >
> > On platforms with multiple IAA devices per package, compress jobs from
> > all cores in a package will be distributed among all IAA devices in the
> > package by the iaa_crypto driver.
> >
> > The newly added zswap_batch_store() follows the general structure of
> > zswap_store(). Some restructuring and optimization has been done to
> > minimize failure points for a batch, fail early, and maximize zswap
> > store pipeline occupancy with SWAP_CRYPTO_BATCH_SIZE pages, potentially
> > from multiple folios in the future. This is intended to maximize
> > reclaim throughput with the IAA hardware's parallel compression.
> >
> > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Suggested-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
>
> This is definitely not what I suggested :)
>
> I won't speak for Johannes here, but I suspect it's not quite what he
> wanted either.

It is not. I suggested having an integrated code path where "legacy"
stores of single pages are just the batch_size=1 case:
https://lore.kernel.org/linux-mm/20241107185340.GG1172372@xxxxxxxxxxx/

> What we really need to do (and I suppose what Johannes meant, but
> please correct me if I am wrong) is to make the existing flow work
> with batches.
>
> For example, most of zswap_store() should remain the same. It is still
> getting a folio to compress; the only difference is that we will
> parallelize the page compressions. zswap_store_page() is where some
> changes need to be made. Instead of a single function that handles the
> storage of each page, we need a vectorized function that handles the
> storage of N pages in a folio (allocate zswap_entry's, do the xarray
> insertions, etc). This refactoring should be a separate patch.
>
> Once we have that, the logic introduced by this patch should really be
> mostly limited to zswap_compress(), where the acomp interfacing would
> differ based on whether batching is supported or not. This could be
> changes in zswap_compress() itself, or maybe at this point we can have
> a completely different path (e.g. zswap_compress_batch()). But outside
> of that, I don't see why we should have a completely different store
> path for the batching.

+1