Re: [PATCH] mm: zswap: multiple zpool support

Johannes Weiner <hannes@xxxxxxxxxxx> · Fri, 2 Jun 2023 16:24:53 -0400

On Fri, Jun 02, 2023 at 12:14:28PM -0700, Yosry Ahmed wrote:
> On Fri, Jun 2, 2023 at 11:34 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Fri, Jun 02, 2023 at 09:59:20AM -0700, Yosry Ahmed wrote:
> > > On Fri, Jun 2, 2023 at 9:49 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > > Again, what about the zswap_tree.lock and swap_info_struct.lock?
> > > > They're the same scope unless you use multiple swap files. Would it
> > > > make sense to tie pools to trees, so that using multiple swapfiles for
> > > > concurrency purposes also implies this optimization?
> > >
> > > Yeah, using multiple swapfiles helps with those locks, but it doesn't
> > > help with the zpool lock.
> > >
> > > I am reluctant to take this path because I am trying to get rid of
> > > zswap's dependency on swapfiles to begin with, and have it act as its
> > > own standalone swapping backend. If I am successful, then having one
> > > zpool per zswap_tree is just a temporary fix.
> >
> > What about making the pools per-cpu?
> >
> > This would scale nicely with the machine size. And we commonly deal
> > with for_each_cpu() loops and per-cpu data structures, so have good
> > developer intuition about what's reasonable to squeeze into those.
> >
> > It would eliminate the lock contention, for everybody, right away, and
> > without asking questions.
> >
> > It would open the door to all kinds of locking optimizations on top.
> 
> The page can get swapped out on one cpu and swapped in on another, no?
> 
> We will need to store which zpool the page is stored in in its zswap
> entry, and potentially grab percpu locks from other cpus in the swap
> in path. The lock contention would probably be less, but certainly not
> eliminated.
> 
> Did I misunderstand?

Sorry, I should have been more precise.

I'm saying that using NR_CPUS pools, and replacing the hash with
smp_processor_id(), would accomplish your goal of pool concurrency.
But it would do so with a broadly-used, well-understood scaling
factor. We might not need a config option at all.

The lock would still be there, but contention would be reduced fairly
optimally (barring preemption), at least for store concurrency. It
would not be fully eliminated, though, yes: frees and compaction can
still end up taking the lock of a pool that another CPU is storing to.
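
To make that concrete, here is a rough userspace sketch of the idea,
not a patch against zswap. Everything in it is made up for
illustration: NR_POOLS stands in for NR_CPUS, the cpu argument stands
in for smp_processor_id(), a pthread mutex stands in for the zpool
lock, and struct pool, struct entry and the function names are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_POOLS 4	/* hypothetical stand-in for NR_CPUS */

struct pool {
	pthread_mutex_t lock;	/* stand-in for the per-zpool lock */
	size_t nr_objects;	/* objects currently held by this pool */
};

struct entry {
	unsigned int pool_id;	/* pool that holds the data, recorded at store time */
};

static struct pool pools[NR_POOLS];

static void pools_init(void)
{
	for (unsigned int i = 0; i < NR_POOLS; i++) {
		pthread_mutex_init(&pools[i].lock, NULL);
		pools[i].nr_objects = 0;
	}
}

/*
 * Store: select the pool by the CPU doing the store instead of by a
 * hash. Concurrent stores on different CPUs take different locks.
 */
static void entry_store(struct entry *e, unsigned int cpu)
{
	assert(cpu < NR_POOLS);

	struct pool *p = &pools[cpu];

	e->pool_id = cpu;
	pthread_mutex_lock(&p->lock);
	p->nr_objects++;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Free: may run on any CPU, so it must use the pool recorded in the
 * entry, not the current CPU. This is where cross-CPU contention on
 * a pool lock can still happen.
 */
static void entry_free(struct entry *e)
{
	struct pool *p = &pools[e->pool_id];

	pthread_mutex_lock(&p->lock);
	p->nr_objects--;
	pthread_mutex_unlock(&p->lock);
}
```

The store side only ever touches the local pool, while entry_free()
follows the recorded pool_id, which is the swapout-on-one-cpu,
swapin-on-another case you raised and the reason the lock has to
stay.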

I'm not proposing more than that at this point. I only wrote the last
line because already having per-cpu data structures might help with
fast path optimizations down the line, if contention is still an
issue. That seems unlikely, though, so it's not important. Let's
forget it.