On Thu, Mar 09, 2023 at 08:48:28PM +0800, Huang, Ying wrote:
> Yosry Ahmed <yosryahmed@xxxxxxxxxx> writes:
>
> >
> > struct swap_desc {
> >     union { /* Use one bit to distinguish them */
> >         swp_entry_t swap_entry;
> >         struct zswap_entry *zswap_entry;
> >     };
> >     struct folio *swapcache;
> >     atomic_t swap_count;
> >     u32 id;
> > }
> >
> > Having the id in the swap_desc is convenient as we can directly map
> > the swap_desc to a swp_entry_t to place in the page tables, but I
> > don't think it's necessary. Without it, the struct size is 20 bytes,
> > so I think the extra 4 bytes are okay to use anyway if the slab
> > allocator only allocates multiples of 8 bytes.
> >
> > The idea here is to unify the swapcache and swap_count implementation
> > between different swap backends (swapfiles, zswap, etc), which would
> > create a better abstraction and reduce reinventing the wheel.
> >
> > We can reduce to only 8 bytes and only store the swap/zswap entry, but
> > we still need the swap cache anyway so might as well just store the
> > pointer in the struct and have a unified lookup-free swapcache, so
> > really 16 bytes is the minimum.
> >
> > If we stop at 16 bytes, then we need to handle swap count separately
> > in swapfiles and zswap. This is not the end of the world, but are the
> > 8 bytes worth this?
>
> If my understanding were correct, for current implementation, we need
> one swap cache pointer per swapped out page too. Even after calling
> __delete_from_swap_cache(), we store the "shadow" entry there. Although

That is correct. We have the "shadow" entry.

> it's possible to implement shadow entry reclaiming like that for file
> cache shadow entry (workingset_shadow_shrinker), we haven't done that
> yet. And, it appears that we can live with that. So, in current
> implementation, for each swapped out page, we use 9 bytes. If so, the
> memory usage ratio is 24 / 9 = 2.667, still not trivial, but not as
> horrible as 24 / 1 = 24.

The swap_desc proposal did not explicitly save the shadow entry in the
swap_desc, so the math should be (24 + 8) vs (1 + 8) bytes. That is
about 23 extra bytes per swapped out page.

> > Instead, if we store a key else in swp_entry_t and use this to lookup
> > the swp_entry_t or zswap_entry pointer then that's essentially what
> > the swap_desc does. It just goes the extra mile of unifying the
> > swapcache as well and storing it directly in the swap_desc instead of
> > storing it in another lookup structure.
>
> If we choose to make sizeof(struct swap_desc) == 8, that is, store only
> swap_entry in swap_desc. The added indirection appears to be another
> level of page table with 1 entry. Then, we may use the similar method
> as supporting system with 2 level and 3 level page tables, like the code
> in include/asm-generic/pgtable-nopmd.h. But I haven't thought about
> this deeply.

I would like to explore other possibilities as well. More ideas and
discussion are welcome.

Chris
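
PS: To make the one-bit discrimination in the union above concrete, here
is a minimal sketch of how it could be encoded. This is only
illustrative and not part of Yosry's proposal; the helper names, the
macro, and the choice of the low bit are mine, and it assumes struct
zswap_entry pointers are at least 2-byte aligned and that the top bit
of swp_entry_t.val is unused:

#include <linux/swap.h>		/* swp_entry_t */

struct zswap_entry;		/* opaque here, defined in mm/zswap.c */

/*
 * Sketch: use the lowest bit of the stored word to tell the two union
 * members apart. Bit set: the word carries a swp_entry_t value shifted
 * up by one. Bit clear: the word is an aligned zswap_entry pointer.
 */
#define SWAP_DESC_SWP_ENTRY	0x1UL

static inline bool swap_desc_is_swp_entry(unsigned long val)
{
	return val & SWAP_DESC_SWP_ENTRY;
}

static inline unsigned long swap_desc_encode_swp_entry(swp_entry_t entry)
{
	/* assumes the top bit of entry.val is never used */
	return (entry.val << 1) | SWAP_DESC_SWP_ENTRY;
}

static inline swp_entry_t swap_desc_decode_swp_entry(unsigned long val)
{
	return (swp_entry_t){ .val = val >> 1 };
}

static inline struct zswap_entry *swap_desc_zswap_entry(unsigned long val)
{
	return (struct zswap_entry *)val;
}

A lookup would then test the low bit first and take either the swapfile
or the zswap path, which keeps the union member at a single word.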