On Mon, Nov 21, 2022 at 10:00:16PM +0300, Alexey Romanov wrote:
> Hello!
>
> This RFC series adds feature which allows merge identical
> compressed pages into a single one. The main idea is that
> zram only stores object references, which store the compressed
> content of the pages. Thus, the contents of the zsmalloc objects
> don't change in any way.
>
> For simplicity, let's imagine that 3 pages with the same content
> got into zram:
>
> +----------------+   +----------------+   +----------------+
> |zram_table_entry|   |zram_table_entry|   |zram_table_entry|
> +-------+--------+   +-------+--------+   +--------+-------+
>         |                    |                     |
>         | handle (1)         | handle (2)          | handle (3)
> +-------v--------+   +-------v---------+  +--------v-------+
> |zsmalloc  object|   |zsmalloc  object |  |zsmalloc  object|
> ++--------------++   +-+-------------+-+  ++--------------++
>  +--------------+      +-------------+     +--------------+
>  | buffer: "abc"|      |buffer: "abc"|     | buffer: "abc"|
>  +--------------+      +-------------+     +--------------+
>
> As you can see, the data is duplicated. Merge mechanism saves
> (after scanning objects) only one zsmalloc object. Here's
> what happens after the scan and merge:
>
> +----------------+   +----------------+   +----------------+
> |zram_table_entry|   |zram_table_entry|   |zram_table_entry|
> +-------+--------+   +-------+--------+   +--------+-------+
>         |                    |                     |
>         | handle (1)         | handle (1)          | handle (1)
>         |           +--------v---------+           |
>         +-----------> zsmalloc  object <-----------+
>                     +--+-------------+-+
>                        +-------------+
>                        |buffer: "abc"|
>                        +-------------+
>
> Thus, we reduced the amount of memory occupied by 3 times.
>
> This mechanism doesn't affect the perf of the zram itself in
> any way (maybe just a little bit on the zram_free_page function).
> In order to describe each such identical object, we (constantly)
> need sizeof(zram_rbtree_node) bytes.
> So, for example, if the system
> has 20 identical buffers with a size of 1024, the memory gain will be
> (20 * 1024) - (1 * 1024 + sizeof(zram_rbtree_node)) = 19456 -
> sizeof(zram_rbtree_node) bytes. But, it should be understood, these are
> counts without zsmalloc data structures overhead.
>
> Testing on my system (8GB ram + 1 gb zram swap) showed that at high
> loads, on average, when calling the merge mechanism, we can save
> up to 15-20% of the memory usage.

This looks pretty great.

However, I'm curious why it's specific to zram, and not part of
zsmalloc? That way zswap would benefit as well, without having to
duplicate the implementation. This happened for example with
page_same_filled() and zswap_is_page_same_filled().

It's zsmalloc's job to store content efficiently, so couldn't this
feature (just like the page_same_filled one) be an optimization that
zsmalloc does transparently for all its users?

> This patch series adds a new sysfs node (trigger merging) and new
> field in mm_stat (how many pages are merged in zram at the moment):
>
> $ cat /sys/block/zram/mm_stat
> 431452160 332984392 339894272 0 339894272 282 0 51374 51374 0
>
> $ echo 1 > /sys/block/zram/merge
>
> $ cat /sys/block/zram/mm_stat
> 431452160 270376848 287301504 0 339894272 282 0 51374 51374 6593

The optimal frequency for calling this is probably tied to prevalent
memory pressure, which is somewhat tricky to do from userspace. Would
it make sense to hook this up to a shrinker?
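
For reference, the merge idea and the savings estimate from the cover
letter can be modelled in a few lines of userspace C. This is only a
rough sketch under stated assumptions, not the code from the series:
struct obj, struct table, obj_new(), obj_put(), table_merge() and
demo_saved_bytes() are hypothetical names invented here, the lookup is
a plain O(n^2) compare rather than whatever tree the zram_rbtree_node
name implies, and the per-object bookkeeping cost is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a zsmalloc object: payload plus refcount. */
struct obj {
	size_t len;
	unsigned int refs;
	unsigned char data[];
};

/* Hypothetical stand-in for the zram table: one object per slot. */
struct table {
	struct obj **slot;
	size_t nr;
};

static struct obj *obj_new(const void *buf, size_t len)
{
	struct obj *o = malloc(sizeof(*o) + len);

	assert(o);
	o->len = len;
	o->refs = 1;
	memcpy(o->data, buf, len);
	return o;
}

/* Drop one reference; returns the payload bytes freed (0 if still shared). */
static size_t obj_put(struct obj *o)
{
	size_t len = o->len;

	if (--o->refs)
		return 0;
	free(o);
	return len;
}

/*
 * Merge pass: repoint each slot at the earliest slot holding identical
 * content and drop the duplicate. Returns the payload bytes released.
 */
static size_t table_merge(struct table *t)
{
	size_t saved = 0;

	for (size_t i = 1; i < t->nr; i++) {
		for (size_t j = 0; j < i; j++) {
			struct obj *a = t->slot[j], *b = t->slot[i];

			if (a == b || a->len != b->len ||
			    memcmp(a->data, b->data, a->len))
				continue;
			a->refs++;
			saved += obj_put(b);
			t->slot[i] = a;
			break;
		}
	}
	return saved;
}

/* The cover letter's example: 20 identical buffers of 1024 bytes each. */
static size_t demo_saved_bytes(void)
{
	unsigned char buf[1024];
	struct obj *slots[20];
	struct table t = { .slot = slots, .nr = 20 };
	size_t saved;

	memset(buf, 'a', sizeof(buf));
	for (size_t i = 0; i < t.nr; i++)
		slots[i] = obj_new(buf, sizeof(buf));

	saved = table_merge(&t);

	/* All slots now share one object; the last put frees it. */
	for (size_t i = 0; i < t.nr; i++)
		obj_put(t.slot[i]);
	return saved;
}
```

In this model demo_saved_bytes() releases 19 * 1024 = 19456 payload
bytes, which is the cover letter's figure before subtracting
sizeof(zram_rbtree_node). A second table_merge() call on the same table
releases nothing, since every duplicate already shares one object.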