On 2024/2/3 06:44, Nhat Pham wrote:
> On Fri, Feb 2, 2024 at 2:37 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>>
>> On Fri, Feb 2, 2024 at 2:33 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>>>
>>> On Thu, Feb 1, 2024 at 7:50 AM Chengming Zhou
>>> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>>>
>>>> Since we don't need to leave zswap entry on the zswap tree anymore,
>>>> we should remove it from tree once we find it from the tree.
>>>>
>>>> Then after using it, we can directly free it, no concurrent path
>>>> can find it from tree. Only the shrinker can see it from lru list,
>>>> which will also double check under tree lock, so no race problem.
>>>>
>>>> So we don't need refcount in zswap entry anymore and don't need to
>>>> take the spinlock for the second time to invalidate it.
>>>>
>>>> The side effect is that zswap_entry_free() maybe not happen in tree
>>>> spinlock, but it's ok since nothing need to be protected by the lock.
>>>>
>>>> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>>>
>>> Oh this is sweet! Fewer things to keep in mind.
>>> Reviewed-by: Nhat Pham <nphamcs@xxxxxxxxx>
>>>
>>>> ---
>>>>  mm/zswap.c | 63 +++++++++++----------------------------------------------------
>>>>  1 file changed, 11 insertions(+), 52 deletions(-)
>>>>
>>>> diff --git a/mm/zswap.c b/mm/zswap.c
>>>> index cbf379abb6c7..cd67f7f6b302 100644
>>>> --- a/mm/zswap.c
>>>> +++ b/mm/zswap.c
>>>> @@ -193,12 +193,6 @@ struct zswap_pool {
>>>>   *
>>>>   * rbnode - links the entry into red-black tree for the appropriate swap type
>>>>   * swpentry - associated swap entry, the offset indexes into the red-black tree
>>>> - * refcount - the number of outstanding reference to the entry. This is needed
>>>> - *            to protect against premature freeing of the entry by code
>>>> - *            concurrent calls to load, invalidate, and writeback. The lock
>>>> - *            for the zswap_tree structure that contains the entry must
>>>> - *            be held while changing the refcount. Since the lock must
>>>> - *            be held, there is no reason to also make refcount atomic.
>>>>   * length - the length in bytes of the compressed page data. Needed during
>>>>   *          decompression. For a same value filled page length is 0, and both
>>>>   *          pool and lru are invalid and must be ignored.
>>>> @@ -211,7 +205,6 @@ struct zswap_pool {
>>>>  struct zswap_entry {
>>>>         struct rb_node rbnode;
>>>>         swp_entry_t swpentry;
>>>> -       int refcount;
>>>
>>> Hah this should even make zswap a bit more space-efficient. IIRC Yosry
>>> has some analysis regarding how much less efficient zswap will be
>>> every time we add a new field to zswap entry - this should go in the
>>> opposite direction :)
>>
>> Unfortunately in this specific case I think it won't change the size
>> of the allocation for struct zswap_entry anyway, but it is a step
>> nonetheless :)
>
> Ah, is it because of the field alignment requirement? But yeah, one
> day we will remove enough of them :)

Yeah, the size of struct zswap_entry is not changed :) If the xarray
conversion lands later, the rb_node will be gone, so one cacheline
will be enough. Current layout from pahole:
struct zswap_entry {
	struct rb_node             rbnode __attribute__((__aligned__(8))); /*     0    24 */
	swp_entry_t                swpentry;            /*    24     8 */
	unsigned int               length;              /*    32     4 */

	/* XXX 4 bytes hole, try to pack */

	struct zswap_pool *        pool;                /*    40     8 */
	union {
		long unsigned int  handle;              /*    48     8 */
		long unsigned int  value;               /*    48     8 */
	};                                              /*    48     8 */
	struct obj_cgroup *        objcg;               /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct list_head           lru;                 /*    64    16 */

	/* size: 80, cachelines: 2, members: 7 */
	/* sum members: 76, holes: 1, sum holes: 4 */
	/* forced alignments: 1 */
	/* last cacheline: 16 bytes */
} __attribute__((__aligned__(8)));
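As a side note on the size question above, here is a minimal userspace
sketch (not the kernel code; the fake_* types and both struct names are
simplified stand-ins made up for illustration, assuming a typical 64-bit
ABI) showing why dropping the 4-byte refcount cannot shrink the entry:
the freed bytes just become padding in front of the 8-byte-aligned pool
pointer, so sizeof() stays at 80 either way.

#include <stdio.h>

/* Stand-ins with the same sizes as the kernel types on 64-bit. */
struct fake_rb_node   { unsigned long p[3]; };  /* 24 bytes, like struct rb_node   */
struct fake_list_head { void *next, *prev; };   /* 16 bytes, like struct list_head */

/* Layout before the patch: refcount packs into the slot after swpentry. */
struct entry_with_refcount {
	struct fake_rb_node   rbnode;
	unsigned long         swpentry;
	int                   refcount;
	unsigned int          length;
	void                 *pool;
	unsigned long         handle;
	void                 *objcg;
	struct fake_list_head lru;
};

/* Layout after the patch: a 4-byte hole follows 'length' before 'pool'. */
struct entry_without_refcount {
	struct fake_rb_node   rbnode;
	unsigned long         swpentry;
	unsigned int          length;
	void                 *pool;
	unsigned long         handle;
	void                 *objcg;
	struct fake_list_head lru;
};

int main(void)
{
	/* Both print 80 on a typical x86_64 build: two cachelines either way. */
	printf("with refcount:    %zu\n", sizeof(struct entry_with_refcount));
	printf("without refcount: %zu\n", sizeof(struct entry_without_refcount));
	return 0;
}

Once the 24-byte rb_node goes away with the xarray conversion, the
remaining members sum to roughly 56 bytes with the current layout, which
is why a single 64-byte cacheline should then be enough.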