On Wed, Mar 20, 2024 at 01:49:17PM -0700, Yosry Ahmed wrote:
> Hey folks,
>
> I was looking at cleaning up the same-filled handling code in zswap,
> when it hit me that after the xarray conversion, the only member of
> struct zswap_entry that is relevant to same-filled pages is now the
> objcg pointer.
>
> The xarray allows a pointer to be tagged by up to two tags (1 and 3),
> so we can completely avoid allocating a zswap_entry for same-filled
> pages by storing a tagged objcg pointer directly in the xarray
> instead.
>
> Basically the xarray would then either have a pointer to struct
> zswap_entry or struct obj_cgroup, where the latter is tagged as
> SAME_FILLED_ONE or SAME_FILLED_ZERO.
>
> There are two benefits of this:
> - Saving some memory (precisely 64 bytes per same-filled entry).
> - Further separating handling of same-filled pages from compressed
>   pages, which results in some nice cleanups (especially in
>   zswap_store()). It also makes further improvements easier (e.g.
>   skipping limit checking for same-filled entries).

This sounds interesting. Where would you store the byte value the page
is filled with? Or would you limit it to zero-filled pages only?

> The disadvantage is obviously the complexity needed to handle two
> different types of pointers in the xarray, although I think with the
> correct abstractions this is not a big deal.
>
> I have some untested patches that implement this that I plan on
> testing and sending out at some point. The reason I am sending this
> RFC now is to gauge interest. I am not sure how common same-filled
> pages are. Unfortunately, this data is not easy to collect from our
> fleet (still working on it), so if someone has data from actual
> workloads that would be helpful.
>
> Running the kernel build test only shows a small amount of same-filled
> pages landing in zswap, but I am thinking maybe actual workloads have
> more zeroed pages lying around.

In our fleet, same-filled pages seem to average pretty consistently at
10% of total stored entries. I'd assume they're mostly zero-filled
rather than other patterns, but I don't have any data on that.

The memory savings from dropping the entry would be a few megabytes per
host, probably not an argument by itself. But the code simplifications
and shaving a few cycles off the fast paths do sound promising.
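
For reference, my mental model of the proposal is something like the
sketch below. It is untested and not from your patches: the tag macros,
the helper names, and the dup-store handling are all made up here; only
xa_store()/xa_tag_pointer()/xa_pointer_tag()/xa_untag_pointer() are the
real xarray API. It just shows how the two pointer types could be told
apart without allocating a zswap_entry for the same-filled case:

	/* Placeholder tag values; 1 and 3 are the two tags the xarray
	 * allows on user pointers. */
	#define ZSWAP_SAME_FILLED_ZERO	1UL
	#define ZSWAP_SAME_FILLED_ONE	3UL

	/* Store path: stash the tagged objcg pointer directly in the
	 * tree, skipping the zswap_entry allocation. Any previously
	 * stored entry (dup store) is ignored in this sketch. */
	static int zswap_store_same_filled(struct xarray *tree,
					   pgoff_t offset,
					   struct obj_cgroup *objcg,
					   bool filled_with_ones)
	{
		unsigned long tag = filled_with_ones ?
					ZSWAP_SAME_FILLED_ONE :
					ZSWAP_SAME_FILLED_ZERO;
		void *old;

		old = xa_store(tree, offset,
			       xa_tag_pointer(objcg, tag), GFP_KERNEL);
		return xa_is_err(old) ? xa_err(old) : 0;
	}

	/* Load path: an untagged pointer is a struct zswap_entry, a
	 * tagged one is a same-filled objcg. */
	static bool zswap_entry_is_same_filled(void *entry)
	{
		return xa_pointer_tag(entry) != 0;
	}

	static struct obj_cgroup *zswap_same_filled_objcg(void *entry)
	{
		return xa_untag_pointer(entry);
	}

If that matches what you have in mind, the lookup-side changes really
do look mechanical, and the store side gets to bail out before touching
the allocator or the compression code at all.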