On Wed, Mar 20, 2024 at 2:19 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > On Wed, Mar 20, 2024 at 05:07:21PM -0400, Johannes Weiner wrote: > > On Wed, Mar 20, 2024 at 01:49:17PM -0700, Yosry Ahmed wrote: > > > Hey folks, > > > > > > I was looking at cleaning up the same-filled handling code in zswap, > > > when it hit me that after the xarray conversion, the only member of > > > struct zwap_entry that is relevant to same-filled pages is now the > > > objcg pointer. > > > > > > The xarray allows a pointer to be tagged by up to two tags (1 and 3), > > > so we can completely avoid allocating a zswap_entry for same-filled > > > pages by storing a tagged objcg pointer directly in the xarray > > > instead. > > > > > > Basically the xarray would then either have a pointer to struct > > > zswap_entry or struct obj_cgroup, where the latter is tagged as > > > SAME_FILLED_ONE or SAME_FILLED_ZERO. > > > > > > There are two benefits of this: > > > - Saving some memory (precisely 64 bytes per same-filled entry). > > > - Further separating handling of same-filled pages from compressed > > > pages, which results in some nice cleanups (especially in > > > zswap_store()). It also makes further improvements easier (e.g. > > > skipping limit checking for same-filled entries). > > > > This sounds interesting. > > > > Where would you store the byte value it's filled with? Or would you > > limit it to zero-filled only? > > The dumb thing about objcg is that for same-filled entries we really > only need it for bumping ZSWPIN. Nothing else. entry->length is 0 for > them, so even though we call the charge function, it doesn't actually > do anything. > > Loading them is cheap and doesn't involve decompression. An argument > could be made to exclude them from ZSWPOUT and ZSWPIN entirely. > > Or cheat a little and bump ZSWPIN for current->objcg instead - > probably good enough to make excessive thrashing discoverable by the > workload that's directly affected. > > Then you could get rid of the objcg pointer and use the xarray slot > for whatever else you'd want. Yeah it's only useful for the stats. Using current->objcg would work, and should be ultimately pointing to the same memcg in *most* cases, I assume. We still wouldn't be able to store a full word as we do today, because the xarray needs 1 bit for its own usage. So the same-filled implementation would still need to change from repeated words (8 bytes) to something smaller -- or we can just allocate a separate struct for same-filled pages.