Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap

On Thu, Mar 09, 2023 at 08:48:28PM +0800, Huang, Ying wrote:
> Yosry Ahmed <yosryahmed@xxxxxxxxxx> writes:
> 
> >
> > struct swap_desc {
> >     union { /* Use one bit to distinguish them */
> >         swp_entry_t swap_entry;
> >         struct zswap_entry *zswap_entry;
> >     };
> >     struct folio *swapcache;
> >     atomic_t swap_count;
> >     u32 id;
> > }
> >
> > Having the id in the swap_desc is convenient as we can directly map
> > the swap_desc to a swp_entry_t to place in the page tables, but I
> > don't think it's necessary. Without it, the struct size is 20 bytes,
> > so I think the extra 4 bytes are okay to use anyway if the slab
> > allocator only allocates multiples of 8 bytes.
> >
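As a quick aside (an editor's sketch, not from the original mail): below is a
minimal user-space approximation of the two layouts being compared, assuming a
64-bit build where pointers and swp_entry_t are 8 bytes; the stand-in types are
placeholders for the kernel ones. It also shows that dropping the id does not
actually shrink the object, because the 4 bytes come back as tail padding,
which supports the "extra 4 bytes are okay to use anyway" point above.

/*
 * User-space stand-ins for the two swap_desc layouts above.
 * swp_entry_t, zswap_entry and folio are placeholders here.
 */
#include <stdio.h>
#include <stdint.h>

typedef struct { unsigned long val; } swp_entry_t;     /* stand-in */
struct zswap_entry;                                     /* opaque */
struct folio;                                           /* opaque */

struct swap_desc_with_id {
        union {                         /* one bit distinguishes them */
                swp_entry_t swap_entry;
                struct zswap_entry *zswap_entry;
        };
        struct folio *swapcache;
        int swap_count;                 /* atomic_t stand-in */
        uint32_t id;
};

struct swap_desc_no_id {
        union {
                swp_entry_t swap_entry;
                struct zswap_entry *zswap_entry;
        };
        struct folio *swapcache;
        int swap_count;
};

int main(void)
{
        /* Both typically print 24: without the id, 4 bytes of tail
         * padding remain because of 8-byte pointer alignment. */
        printf("%zu %zu\n", sizeof(struct swap_desc_with_id),
               sizeof(struct swap_desc_no_id));
        return 0;
}
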
> > The idea here is to unify the swapcache and swap_count implementation
> > between different swap backends (swapfiles, zswap, etc), which would
> > create a better abstraction and reduce reinventing the wheel.
> >
> > We can reduce to only 8 bytes and only store the swap/zswap entry, but
> > we still need the swap cache anyway so might as well just store the
> > pointer in the struct and have a unified lookup-free swapcache, so
> > really 16 bytes is the minimum.
> >
> > If we stop at 16 bytes, then we need to handle swap count separately
> > in swapfiles and zswap. This is not the end of the world, but are the
> > 8 bytes worth this?
> 
> If my understanding is correct, the current implementation also needs
> one swap cache pointer per swapped-out page.  Even after calling
> __delete_from_swap_cache(), we store the "shadow" entry there.  Although

That is correct. We have the "shadow" entry.

> it's possible to implement shadow entry reclaim the way it is done for
> file cache shadow entries (workingset_shadow_shrinker), we haven't done
> that yet.  And it appears that we can live with that.  So, in the
> current implementation, we use 9 bytes for each swapped-out page.  If
> so, the memory usage ratio is 24 / 9 = 2.667, still not trivial, but
> not as horrible as 24 / 1 = 24.


The swap_desc proposal does not explicitly store the shadow entry in the
swap_desc, so the math should be (24 + 8) vs (1 + 8) bytes, i.e. 32 vs 9,
or roughly 23 extra bytes per swapped-out page.

> > Instead, if we store a key (or something else) in swp_entry_t and use
> > it to look up the swp_entry_t or zswap_entry pointer, then that's
> > essentially what the swap_desc does. It just goes the extra mile of
> > unifying the swapcache as well, storing it directly in the swap_desc
> > instead of in another lookup structure.
> 
> If we choose to make sizeof(struct swap_desc) == 8, that is, store only
> the swap_entry in the swap_desc, the added indirection appears to be
> another level of page table with 1 entry.  Then, we may use a method
> similar to the one used to support systems with 2-level and 3-level
> page tables, like the code in include/asm-generic/pgtable-nopmd.h.  But
> I haven't thought about this deeply.

I would like to explore other possibilities as well. More ideas and
discussion are welcome.
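
To make the page-table analogy above a bit more concrete, here is a rough
sketch (editor's illustration only; all names and constants are invented, and
this is not a proposal from this thread) of a two-level lookup from a swap id,
as stored in the PTE, to an 8-byte swap_desc:

/*
 * Illustrative two-level, page-table-like lookup from a swap id to an
 * 8-byte swap_desc.  Everything here (swap_desc_dir, DESC_SHIFT, ...)
 * is made up for the sketch.
 */
#include <stdint.h>
#include <stdlib.h>

struct swap_desc {
        uint64_t entry;         /* swap slot or zswap entry, tagged by one bit */
};

#define DESC_SHIFT      9                       /* 512 descriptors (4 KiB) per leaf */
#define DESC_PER_LEAF   (1u << DESC_SHIFT)
#define DIR_ENTRIES     1024                    /* covers 2^19 ids in this toy */

struct swap_desc_dir {
        struct swap_desc *leaves[DIR_ENTRIES];  /* leaves allocated lazily */
};

/* Look up (and lazily allocate) the descriptor for a given swap id. */
static struct swap_desc *
swap_desc_lookup(struct swap_desc_dir *dir, uint32_t id)
{
        uint32_t hi = id >> DESC_SHIFT;
        uint32_t lo = id & (DESC_PER_LEAF - 1);

        if (hi >= DIR_ENTRIES)
                return NULL;
        if (!dir->leaves[hi]) {
                dir->leaves[hi] = calloc(DESC_PER_LEAF, sizeof(struct swap_desc));
                if (!dir->leaves[hi])
                        return NULL;
        }
        return &dir->leaves[hi][lo];
}

int main(void)
{
        struct swap_desc_dir dir = { 0 };

        return swap_desc_lookup(&dir, 12345) ? 0 : 1;
}

With a single top-level entry this collapses to the "another level of page
table with 1 entry" case quoted above, much like pgtable-nopmd.h folds away
an unused page table level.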

Chris




