Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware

Nhat Pham <nphamcs@xxxxxxxxx> · Mon, 6 Nov 2023 15:25:05 -0800

On Mon, Nov 6, 2023 at 12:58 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> > >
> > > This lock is only needed to synchronize updating pool->next_shrink,
> > > right? Can we just use atomic operations instead? (e.g. cmpxchg()).
> >
> > I'm not entirely sure. I think in the pool destroy path, we have to also
> > put the next_shrink memcg, so there's that.
>
> We can use xchg() to replace it with NULL, then put the memcg ref, no?
>
> We can also just hold zswap_pools_lock while shrinking the memcg
> perhaps? It's not a contended lock anyway. It just feels weird to add
> a spinlock to protect one pointer.

Ah this sounds good to me I guess. I'm not opposed to this simplification
of the concurrency scheme.

>
> >
> > >
> > > > +               if (pool->next_shrink == memcg)
> > > > +                       pool->next_shrink =
> > > > +                               mem_cgroup_iter(NULL, pool->next_shrink, NULL, true);
> > > > +               spin_unlock(&pool->next_shrink_lock);
> > > > +       }
> > > > +       spin_unlock(&zswap_pools_lock);
> > > > +}
> > > > +
> > > >  /*********************************
> > > >  * zswap entry functions
> > > >  **********************************/
> > > >  static struct kmem_cache *zswap_entry_cache;
> > > >
> > > > -static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
> > > > +static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp, int nid)
> > > >  {
> > > >         struct zswap_entry *entry;
> > > > -       entry = kmem_cache_alloc(zswap_entry_cache, gfp);
> > > > +       entry = kmem_cache_alloc_node(zswap_entry_cache, gfp, nid);
> > > >         if (!entry)
> > > >                 return NULL;
> > > >         entry->refcount = 1;
> > > [..]
> > > > @@ -1233,15 +1369,15 @@ bool zswap_store(struct folio *folio)
> > > >                 zswap_invalidate_entry(tree, dupentry);
> > > >         }
> > > >         spin_unlock(&tree->lock);
> > > > -
> > > > -       /*
> > > > -        * XXX: zswap reclaim does not work with cgroups yet. Without a
> > > > -        * cgroup-aware entry LRU, we will push out entries system-wide based on
> > > > -        * local cgroup limits.
> > > > -        */
> > > >         objcg = get_obj_cgroup_from_folio(folio);
> > > > -       if (objcg && !obj_cgroup_may_zswap(objcg))
> > > > -               goto reject;
> > > > +       if (objcg && !obj_cgroup_may_zswap(objcg)) {
> > > > +               memcg = get_mem_cgroup_from_objcg(objcg);
> > > > +               if (shrink_memcg(memcg)) {
> > > > +                       mem_cgroup_put(memcg);
> > > > +                       goto reject;
> > > > +               }
> > > > +               mem_cgroup_put(memcg);
> > >
> > > Can we just use RCU here as well? (same around memcg_list_lru_alloc()
> > > call below).
> >
> > For memcg_list_lru_alloc(): there's potentially sleeping in that piece of
> > code I believe? I believe at the very least we'll have to use this gfp_t
> > flag for it to be rcu-safe:
> >
> > GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN
> > not sure the
> >
> > Same go for this particular place IIRC - there's some sleeping done
> > in zswap_writeback_entry(), correct?
>
> Ah right, I missed this. My bad.