On Mon, Mar 25, 2024 at 11:50:15PM +0000, Yosry Ahmed wrote: > After the rbtree to xarray conversion, and dropping zswap_entry.refcount > and zswap_entry.value, the only members of zswap_entry utilized by > zero-filled pages are zswap_entry.length (always 0) and > zswap_entry.objcg. Store the objcg pointer directly in the xarray as a > tagged pointer and avoid allocating a zswap_entry completely for > zero-filled pages. > > This simplifies the code as we no longer need to special case > zero-length cases. We are also able to further separate the zero-filled > pages handling logic and completely isolate them within store/load > helpers. Handling tagged xarray pointers is handled in these two > helpers, as well as the newly introduced helper for freeing tree > elements, zswap_tree_free_element(). > > There is also a small performance improvement observed over 50 runs of > kernel build test (kernbench) comparing the mean build time on a skylake > machine when building the kernel in a cgroup v1 container with a 3G > limit. This is on top of the improvement from dropping support for > non-zero same-filled pages: > > base patched % diff > real 69.915 69.757 -0.229% > user 2956.147 2955.244 -0.031% > sys 2594.718 2575.747 -0.731% > > This probably comes from avoiding the zswap_entry allocation and > cleanup/freeing for zero-filled pages. Note that the percentage of > zero-filled pages during this test was only around 1.5% on average. > Practical workloads could have a larger proportion of such pages (e.g. > Johannes observed around 10% [1]), so the performance improvement should > be larger. > > This change also saves a small amount of memory due to less allocated > zswap_entry's. In the kernel build test above, we save around 2M of > slab usage when we swap out 3G to zswap. > > [1]https://lore.kernel.org/linux-mm/20240320210716.GH294822@xxxxxxxxxxx/ > > Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx> > --- > mm/zswap.c | 137 ++++++++++++++++++++++++++++++----------------------- > 1 file changed, 78 insertions(+), 59 deletions(-) Tbh, I think this makes the code worse overall. Instead of increasing type safety, it actually reduces it, and places that previously dealt with a struct zswap_entry now deal with a void *. If we go ahead with this series, I think it makes more sense to either a) do the explicit subtyping of struct zswap_entry I had proposed, or b) at least keep the specialness handling of the xarray entry as local as possible, so that instead of a proliferating API that operates on void *, you have explicit filtering only where the tree is accessed, and then work on struct zswap_entry as much as possible.