Date: Fri, 19 Jan 2024 19:23:52 +0000
From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
To: Chris Li <chrisl@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx,
linux-mm@xxxxxxxxx, Wei =?utf-8?B?WHXvv7w=?= <weixugc@xxxxxxxxxx>,
Yu Zhao <yuzhao@xxxxxxxxxx>, Greg Thelen <gthelen@xxxxxxxxxx>,
Chun-Tse Shao <ctshao@xxxxxxxxxx>,
Suren =?utf-8?B?QmFnaGRhc2FyeWFu77+8?= <surenb@xxxxxxxxxx>,
Brain Geffon <bgeffon@xxxxxxxxxx>, Minchan Kim <minchan@xxxxxxxxxx>,
Michal Hocko <mhocko@xxxxxxxx>,
Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>,
Huang Ying <ying.huang@xxxxxxxxx>, Nhat Pham <nphamcs@xxxxxxxxx>,
Johannes Weiner <hannes@xxxxxxxxxxx>,
Kairui Song <kasong@xxxxxxxxxxx>,
Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx>,
Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>,
Barry Song <v-songbaohua@xxxxxxxx>,
"Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>,
"Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>,
Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>,
Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Subject: Re: [PATCH 1/2] mm: zswap.c: add xarray tree to zswap
Message-ID: <ZarMHYEqXCOeKau0@xxxxxxxxxx>
References: <20240117-zswap-xarray-v1-0-6daa86c08fae@xxxxxxxxxx>
<20240117-zswap-xarray-v1-1-6daa86c08fae@xxxxxxxxxx>
<CAJD7tkYEx57CPBoaN9GW4M3Mx-+jEsOMWJ02nLKSKD-MLb-WPA@xxxxxxxxxxxxxx>
<CAF8kJuO5tAqwyKQK7AasWgs3Ohfc2osD9oX0m8YAkfsAZsjjyQ@xxxxxxxxxxxxxx>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
<CAF8kJuO5tAqwyKQK7AasWgs3Ohfc2osD9oX0m8YAkfsAZsjjyQ@xxxxxxxxxxxxxx>
> > -static struct zswap_entry *zswap_rb_search(struct rb_root *root,
pgoff_t offset)
> > +static struct zswap_entry *zswap_search(struct zswap_tree *tree,
pgoff_t offset)
>
> Let's change the zswap_rb_* prefixes to zswap_tree_* instead of just
> zswap_*. Otherwise, it will be confusing to have both zswap_store and
> zswap_insert (as well as zswap_load and zswap_search).
How about zswap_xa_* ?
SGTM.
>
> [..]
> > @@ -1790,15 +1808,21 @@ void zswap_swapon(int type)
> > void zswap_swapoff(int type)
> > {
> > struct zswap_tree *tree = zswap_trees[type];
> > - struct zswap_entry *entry, *n;
> > + struct zswap_entry *entry, *e, *n;
> > + XA_STATE(xas, tree ? &tree->xarray : NULL, 0);
> >
> > if (!tree)
> > return;
> >
> > /* walk the tree and free everything */
> > spin_lock(&tree->lock);
> > +
> > + xas_for_each(&xas, e, ULONG_MAX)
>
> Why not use xa_for_each?
>
> > + zswap_invalidate_entry(tree, e);
> > +
> > rbtree_postorder_for_each_entry_safe(entry, n, &tree->rbroot,
rbnode)
> > - zswap_free_entry(entry);
>
> Replacing zswap_free_entry() with zswap_invalidate_entry() is a
> behavioral change that should be done separate from this series, but I
> am wondering why it's needed. IIUC, the swapoff code should be making
> sure there are no ongoing swapin/swapout operations, and there are no
> pages left in zswap to writeback.
>
> Is it the case that swapoff may race with writeback, such that
> writeback is holding the last remaining ref after zswap_invalidate()
> is called, and then zswap_swapoff() is called freeing the zswap entry
> while writeback is still accessing it?
For the RB tree the mapping is stored in the zswap entry as RB node.
That is different from xarray. Xarry stores the mapping outside of
zswap entry. Just freeing the entry does not remove the mapping from
xarray. Therefore it needs to call zswap_invalidate_entry() to remove
the entry from the xarray. I could call zswap_erase() then free entry.
I just think zswap_invalidate_entry() is more consistent with the rest
of the code.
Do we have to call xa_destroy() anyway to make sure everything is
cleaned up in the xarray? In that case, we can just do that after the
loop.