On Tue, Mar 26, 2024 at 11:42 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Tue, Mar 26, 2024 at 11:35 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > A very deep RB tree requires rebalancing at times. That contributes
> > to the zswap fault latencies. An xarray does not need to perform
> > tree rebalancing. Replacing the RB tree with an xarray can yield a
> > small performance gain.
> >
> > One small difference is that an xarray insert might fail with
> > ENOMEM, while an RB tree insert does not allocate additional memory.
> >
> > The zswap_entry size will shrink a bit due to removing the RB node,
> > which has two pointers and a color field. The xarray stores the
> > pointer in the xarray tree rather than in the zswap_entry. Every
> > entry has one pointer from the xarray tree. Overall, switching to an
> > xarray should save some memory, if the swap entries are densely
> > packed.
> >
> > Notice that zswap_rb_search and zswap_rb_insert are often followed
> > by zswap_rb_erase. Use xa_erase and xa_store directly. That saves
> > one tree lookup as well.
> >
> > Remove zswap_invalidate_entry, since there is no need to call
> > zswap_rb_erase anymore. Use zswap_free_entry instead.
> >
> > The "struct zswap_tree" has been replaced by "struct xarray". The
> > tree spin lock has been replaced by the xarray lock.
> >
> > Kernel build testing was run 5 times for each version; the averages:
> > (memory.max=2GB, zswap shrinker and writeback enabled, one 50GB
> > swapfile, 24 HT cores, 32 jobs)
> >
> >        mm-unstable-4aaccadb5c04     xarray v9
> > user   3548.902                     3534.375
> > sys    522.232                      520.976
> > real   202.796                      200.864
> >
> > Signed-off-by: Chris Li <chrisl@xxxxxxxxxx>
>
> I removed the previous review tags because I would like to get some
> review of the conflict resolution as well.

[..]

> > @@ -1624,20 +1562,14 @@ bool zswap_load(struct folio *folio)
> >         pgoff_t offset = swp_offset(swp);
> >         struct page *page = &folio->page;
> >         bool swapcache = folio_test_swapcache(folio);
> > -       struct zswap_tree *tree = swap_zswap_tree(swp);
> > +       struct xarray *tree = swap_zswap_tree(swp);
> >         struct zswap_entry *entry;
> >         u8 *dst;
> >
> >         VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > -       spin_lock(&tree->lock);
> > -       entry = zswap_rb_search(&tree->rbroot, offset);
> > -       if (!entry) {
> > -               spin_unlock(&tree->lock);
> > -               return false;
> > -       }
> >         /*
> > -        * When reading into the swapcache, invalidate our entry. The
> > +        * When reading into the swapcache, erase our entry. The
> >          * swapcache can be the authoritative owner of the page and
> >          * its mappings, and the pressure that results from having two
> >          * in-memory copies outweighs any benefits of caching the
> > @@ -1649,8 +1581,12 @@ bool zswap_load(struct folio *folio)
> >          * the fault fails. We remain the primary owner of the entry.)
> >          */
> >         if (swapcache)
> > -               zswap_rb_erase(&tree->rbroot, entry);
> > -       spin_unlock(&tree->lock);
> > +               entry = xa_erase(tree, offset);
> > +       else
> > +               entry = xa_load(tree, offset);
>
> This is the place where I made the modification for the conflict
> resolution. Whether it executes xa_erase() or xa_load() depends on
> swapcache. Obviously, xa_load() will not delete the entry from the
> tree.

The conflict resolution LGTM. If this is the only change from v8 then:

Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
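
For reference, here is a minimal sketch of the store-side difference the
commit message calls out (unlike an RB tree insert, xa_store() allocates
tree nodes and can fail with -ENOMEM). It assumes the zswap_store()
context from the patch (tree, offset, entry) and the zswap_free_entry()
helper mentioned above; the store_failed label is hypothetical, not
necessarily the patch's actual error path:

        struct zswap_entry *old;
        int err;

        /* xa_store() returns the entry previously at this index, or an
         * xa_err()-encoded errno on failure; xa_err() extracts the errno
         * and returns 0 for a non-error entry. */
        old = xa_store(tree, offset, entry, GFP_KERNEL);
        err = xa_err(old);
        if (err) {
                /* Unlike RB tree insertion, this can fail, typically
                 * with -ENOMEM from the node allocation. */
                goto store_failed;
        }
        if (old)
                /* A stale entry was left at this offset; free it. */
                zswap_free_entry(old);

Because xa_store() takes the xarray's internal lock, this is also what
lets the patch drop the zswap_tree spin lock around insertions.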