Re: [PATCH v4 01/22] mm/zsmalloc: add zpdesc memory descriptor for zswap.zpool

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 8/3/24 2:52 AM, Vishal Moola wrote:
> On Mon, Jul 29, 2024 at 07:25:13PM +0800, alexs@xxxxxxxxxx wrote:
>> From: Alex Shi <alexs@xxxxxxxxxx>
> 
> I've been busy with other things, so I haven't been able to review this
> until now. Thanks to both you and Hyeonggon for working on this memdesc :)

Hi Vishal,

Thank a lot for your comments!

My pleasure! :)

> 
>> The 1st patch introduces new memory decriptor zpdesc and rename
>> zspage.first_page to zspage.first_zpdesc, no functional change.
>>
>> We removed PG_owner_priv_1 since it was moved to zspage after
>> commit a41ec880aa7b ("zsmalloc: move huge compressed obj from
>> page to zspage").
>>
>> And keep the memcg_data member, since as Yosry pointed out:
>> "When the pages are freed, put_page() -> folio_put() -> __folio_put() will call
>> mem_cgroup_uncharge(). The latter will call folio_memcg() (which reads
>> folio->memcg_data) to figure out if uncharging needs to be done.
>>
>> There are also other similar code paths that will check
>> folio->memcg_data. It is currently expected to be present for all
>> folios. So until we have custom code paths per-folio type for
>> allocation/freeing/etc, we need to keep folio->memcg_data present and
>> properly initialized."
>>
>> Originally-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
>> Signed-off-by: Alex Shi <alexs@xxxxxxxxxx>
>> ---
>>  mm/zpdesc.h   | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  mm/zsmalloc.c | 21 ++++++++--------
>>  2 files changed, 76 insertions(+), 11 deletions(-)
>>  create mode 100644 mm/zpdesc.h
>>
>> diff --git a/mm/zpdesc.h b/mm/zpdesc.h
>> new file mode 100644
>> index 000000000000..2dbef231f616
>> --- /dev/null
>> +++ b/mm/zpdesc.h
>> @@ -0,0 +1,66 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* zpdesc.h: zswap.zpool memory descriptor
>> + *
>> + * Written by Alex Shi <alexs@xxxxxxxxxx>
>> + *	      Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
>> + */
>> +#ifndef __MM_ZPDESC_H__
>> +#define __MM_ZPDESC_H__
>> +
>> +/*
>> + * struct zpdesc -	Memory descriptor for zpool memory, now is for zsmalloc
>> + * @flags:		Page flags, PG_private: identifies the first component page
>> + * @lru:		Indirectly used by page migration
>> + * @mops:		Used by page migration
>> + * @next:		Next zpdesc in a zspage in zsmalloc zpool
>> + * @handle:		For huge zspage in zsmalloc zpool
>> + * @zspage:		Pointer to zspage in zsmalloc
>> + * @memcg_data:		Memory Control Group data.
>> + *
> 
> I think its a good idea to include comments for the padding (namely what
> aliases with it in struct page) here as well. It doesn't hurt, and will
> make them easier to remove in the future.
> 
>> + * This struct overlays struct page for now. Do not modify without a good
>> + * understanding of the issues.
>> + */
>> +struct zpdesc {
>> +	unsigned long flags;
>> +	struct list_head lru;
>> +	struct movable_operations *mops;
>> +	union {
>> +		/* Next zpdescs in a zspage in zsmalloc zpool */
>> +		struct zpdesc *next;
>> +		/* For huge zspage in zsmalloc zpool */
>> +		unsigned long handle;
>> +	};
>> +	struct zspage *zspage;
> 
> I like using pointers here, although I think the comments should be more
> precise about what the purpose of the pointer is. Maybe something like
> "Points to the zspage this zpdesc is a part of" or something.

I will change the comments for this member. Thanks!

> 
>> +	unsigned long _zp_pad_1;
>> +#ifdef CONFIG_MEMCG
>> +	unsigned long memcg_data;
>> +#endif
>> +};
> 
> You should definitely fold your additions to the struct from patch 17
> into this patch. It makes it easier to review, and better for anyone
> looking at the commit log in the future.

Thanks! I will move the struct part from patch 17 here.

> 
>> +#define ZPDESC_MATCH(pg, zp) \
>> +	static_assert(offsetof(struct page, pg) == offsetof(struct zpdesc, zp))
>> +
>> +ZPDESC_MATCH(flags, flags);
>> +ZPDESC_MATCH(lru, lru);
>> +ZPDESC_MATCH(mapping, mops);
>> +ZPDESC_MATCH(index, next);
>> +ZPDESC_MATCH(index, handle);
>> +ZPDESC_MATCH(private, zspage);
>> +#ifdef CONFIG_MEMCG
>> +ZPDESC_MATCH(memcg_data, memcg_data);
>> +#endif
>> +#undef ZPDESC_MATCH
>> +static_assert(sizeof(struct zpdesc) <= sizeof(struct page));
>> +
>> +#define zpdesc_page(zp)			(_Generic((zp),			\
>> +	const struct zpdesc *:		(const struct page *)(zp),	\
>> +	struct zpdesc *:		(struct page *)(zp)))
>> +
>> +#define zpdesc_folio(zp)		(_Generic((zp),			\
>> +	const struct zpdesc *:		(const struct folio *)(zp),	\
>> +	struct zpdesc *:		(struct folio *)(zp)))
>> +
>> +#define page_zpdesc(p)			(_Generic((p),			\
>> +	const struct page *:		(const struct zpdesc *)(p),	\
>> +	struct page *:			(struct zpdesc *)(p)))
>> +
>> +#endif
> 
> I'm don't think we need both page and folio cast functions for zpdescs.
> Sticking to pages will probably suffice (and be easiest) since all APIs
> zsmalloc cares about are already defined. 
> 
> We can stick to 1 "middle-man" descriptor for zpdescs since zsmalloc
> uses those pages as space to track zspages and nothing more. We'll likely
> end up completely removing it from zsmalloc once we can allocate
> memdescs on their own: It seems most (if not all) of the "indirect" members
> of zpdesc are used as indicators to the rest of core-mm telling them not to
> mess with that memory.

Yes, that is also my first attempt to skip folio part, but I found we could got
6.3% object size reduced on zsmalloc.o file, from 37.2KB to 34.9KB, if we use
folio series lock and folio_get/put functions. That saving come from compound_head
check skipping.
So I wrapped them carefully in zpdesc series functions in zpdesc.h file.
They should be easy replaced when we use memdescs in the future. Could we keep them
a while, or ?

Thanks
Alex
 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux