On 8/3/24 2:52 AM, Vishal Moola wrote: > On Mon, Jul 29, 2024 at 07:25:13PM +0800, alexs@xxxxxxxxxx wrote: >> From: Alex Shi <alexs@xxxxxxxxxx> > > I've been busy with other things, so I haven't been able to review this > until now. Thanks to both you and Hyeonggon for working on this memdesc :) Hi Vishal, Thank a lot for your comments! My pleasure! :) > >> The 1st patch introduces new memory decriptor zpdesc and rename >> zspage.first_page to zspage.first_zpdesc, no functional change. >> >> We removed PG_owner_priv_1 since it was moved to zspage after >> commit a41ec880aa7b ("zsmalloc: move huge compressed obj from >> page to zspage"). >> >> And keep the memcg_data member, since as Yosry pointed out: >> "When the pages are freed, put_page() -> folio_put() -> __folio_put() will call >> mem_cgroup_uncharge(). The latter will call folio_memcg() (which reads >> folio->memcg_data) to figure out if uncharging needs to be done. >> >> There are also other similar code paths that will check >> folio->memcg_data. It is currently expected to be present for all >> folios. So until we have custom code paths per-folio type for >> allocation/freeing/etc, we need to keep folio->memcg_data present and >> properly initialized." >> >> Originally-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> >> Signed-off-by: Alex Shi <alexs@xxxxxxxxxx> >> --- >> mm/zpdesc.h | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++ >> mm/zsmalloc.c | 21 ++++++++-------- >> 2 files changed, 76 insertions(+), 11 deletions(-) >> create mode 100644 mm/zpdesc.h >> >> diff --git a/mm/zpdesc.h b/mm/zpdesc.h >> new file mode 100644 >> index 000000000000..2dbef231f616 >> --- /dev/null >> +++ b/mm/zpdesc.h >> @@ -0,0 +1,66 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> +/* zpdesc.h: zswap.zpool memory descriptor >> + * >> + * Written by Alex Shi <alexs@xxxxxxxxxx> >> + * Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> >> + */ >> +#ifndef __MM_ZPDESC_H__ >> +#define __MM_ZPDESC_H__ >> + >> +/* >> + * struct zpdesc - Memory descriptor for zpool memory, now is for zsmalloc >> + * @flags: Page flags, PG_private: identifies the first component page >> + * @lru: Indirectly used by page migration >> + * @mops: Used by page migration >> + * @next: Next zpdesc in a zspage in zsmalloc zpool >> + * @handle: For huge zspage in zsmalloc zpool >> + * @zspage: Pointer to zspage in zsmalloc >> + * @memcg_data: Memory Control Group data. >> + * > > I think its a good idea to include comments for the padding (namely what > aliases with it in struct page) here as well. It doesn't hurt, and will > make them easier to remove in the future. > >> + * This struct overlays struct page for now. Do not modify without a good >> + * understanding of the issues. >> + */ >> +struct zpdesc { >> + unsigned long flags; >> + struct list_head lru; >> + struct movable_operations *mops; >> + union { >> + /* Next zpdescs in a zspage in zsmalloc zpool */ >> + struct zpdesc *next; >> + /* For huge zspage in zsmalloc zpool */ >> + unsigned long handle; >> + }; >> + struct zspage *zspage; > > I like using pointers here, although I think the comments should be more > precise about what the purpose of the pointer is. Maybe something like > "Points to the zspage this zpdesc is a part of" or something. I will change the comments for this member. Thanks! > >> + unsigned long _zp_pad_1; >> +#ifdef CONFIG_MEMCG >> + unsigned long memcg_data; >> +#endif >> +}; > > You should definitely fold your additions to the struct from patch 17 > into this patch. It makes it easier to review, and better for anyone > looking at the commit log in the future. Thanks! I will move the struct part from patch 17 here. > >> +#define ZPDESC_MATCH(pg, zp) \ >> + static_assert(offsetof(struct page, pg) == offsetof(struct zpdesc, zp)) >> + >> +ZPDESC_MATCH(flags, flags); >> +ZPDESC_MATCH(lru, lru); >> +ZPDESC_MATCH(mapping, mops); >> +ZPDESC_MATCH(index, next); >> +ZPDESC_MATCH(index, handle); >> +ZPDESC_MATCH(private, zspage); >> +#ifdef CONFIG_MEMCG >> +ZPDESC_MATCH(memcg_data, memcg_data); >> +#endif >> +#undef ZPDESC_MATCH >> +static_assert(sizeof(struct zpdesc) <= sizeof(struct page)); >> + >> +#define zpdesc_page(zp) (_Generic((zp), \ >> + const struct zpdesc *: (const struct page *)(zp), \ >> + struct zpdesc *: (struct page *)(zp))) >> + >> +#define zpdesc_folio(zp) (_Generic((zp), \ >> + const struct zpdesc *: (const struct folio *)(zp), \ >> + struct zpdesc *: (struct folio *)(zp))) >> + >> +#define page_zpdesc(p) (_Generic((p), \ >> + const struct page *: (const struct zpdesc *)(p), \ >> + struct page *: (struct zpdesc *)(p))) >> + >> +#endif > > I'm don't think we need both page and folio cast functions for zpdescs. > Sticking to pages will probably suffice (and be easiest) since all APIs > zsmalloc cares about are already defined. > > We can stick to 1 "middle-man" descriptor for zpdescs since zsmalloc > uses those pages as space to track zspages and nothing more. We'll likely > end up completely removing it from zsmalloc once we can allocate > memdescs on their own: It seems most (if not all) of the "indirect" members > of zpdesc are used as indicators to the rest of core-mm telling them not to > mess with that memory. Yes, that is also my first attempt to skip folio part, but I found we could got 6.3% object size reduced on zsmalloc.o file, from 37.2KB to 34.9KB, if we use folio series lock and folio_get/put functions. That saving come from compound_head check skipping. So I wrapped them carefully in zpdesc series functions in zpdesc.h file. They should be easy replaced when we use memdescs in the future. Could we keep them a while, or ? Thanks Alex