Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry

Nhat Pham <nphamcs@xxxxxxxxx> · Wed, 17 Jan 2024 11:29:29 -0800

On Wed, Jan 17, 2024 at 1:52 AM Zhongkun He
<hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
>
> > >
> > > Please forgive me for adding additional information about this patch.
> > >
> > > I have finished the opt for introducing a folio_add_lru_tail(), but
> > > there are many questions:
> > > 1) A new page can be moved to the LRU only by lru_add_fn, so
> > >     folio_add_lru_tail could not add pages to the LRU because of the
> > >     following code in folio_batch_move_lru(), which was added by Alex Shi
> > >     for serializing memcg changes in pagevec_lru_move_fn[1].
> > >
> > > /* block memcg migration while the folio moves between lru */
> > >         if (move_fn != lru_add_fn && !folio_test_clear_lru(folio))
> > >             continue;
> > > To achieve the goal, we need to add a new function like lru_add_fn,
> > > which does not require the lru flag, and folio_add_lru_tail():
> > > +               if (move_fn != lru_add_fn && move_fn != lru_move_tail_fn_new &&
> > > +                       !folio_test_clear_lru(folio))

Hmm yeah, I guess it is a bit more plumbing to do. I prefer this
though - not very fond of hacking current's flag just for a small
optimization :) And I'd argue this is the "right" thing to do -
draining the other LRU operation batches just so that we can
successfully perform an add-to-tail seems hacky and wrong to me.
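
To make the plumbing concrete, here is a rough userspace model of the
check quoted above. It is only a sketch, not kernel code: struct
folio_model, the *_model functions and lru_add_tail_fn_model are all
hypothetical names invented for illustration. It shows why a brand-new
folio (LRU flag not set yet) is skipped by any move function other than
the "add" ones, and how exempting a second add function lets it through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a folio: only the state this check cares about. */
struct folio_model {
	bool lru_flag;	/* models the LRU page flag */
	bool on_lru;	/* linked on an LRU list */
	bool at_tail;	/* sits at the tail of that list */
};

typedef void (*move_fn_t)(struct folio_model *);

/* Models lru_add_fn: links a brand-new folio at the head. */
static void lru_add_fn_model(struct folio_model *f)
{
	f->on_lru = true;
	f->at_tail = false;
}

/* Hypothetical: like lru_add_fn_model, but links at the tail. */
static void lru_add_tail_fn_model(struct folio_model *f)
{
	f->on_lru = true;
	f->at_tail = true;
}

/* Models an ordinary mover that expects the folio to be on the LRU already. */
static void lru_move_tail_fn_model(struct folio_model *f)
{
	f->at_tail = true;
}

/* Models folio_test_clear_lru(): returns the old flag value and clears it. */
static bool test_clear_lru(struct folio_model *f)
{
	bool was_set = f->lru_flag;

	f->lru_flag = false;
	return was_set;
}

/* Returns how many folios in the batch were actually handed to move_fn. */
static size_t batch_move_model(struct folio_model **batch, size_t n,
			       move_fn_t move_fn)
{
	size_t moved = 0;
	bool adds_new;

	assert(move_fn != NULL);
	/* Only the "add" functions may take folios without the LRU flag. */
	adds_new = move_fn == lru_add_fn_model ||
		   move_fn == lru_add_tail_fn_model;

	for (size_t i = 0; i < n; i++) {
		struct folio_model *f = batch[i];

		if (!adds_new && !test_clear_lru(f))
			continue;	/* not on the LRU: skipped */
		move_fn(f);
		f->lru_flag = true;
		moved++;
	}
	return moved;
}
```

In this model, passing lru_move_tail_fn_model for a fresh folio moves
nothing, while lru_add_tail_fn_model adds it at the tail.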

> > >
> > > 2) __read_swap_cache_async has six parameters, so there is no space to
> > > add a new one, add_to_lru_head.

Matthew's solution seems fine to me, no? i.e. using a single flag
parameter to encapsulate all boolean arguments.
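
Something along these lines, as a standalone sketch only. The SWAPIN_*
names, struct swapin_opts and swapin_decode_flags() are made up for
illustration and are not existing kernel symbols; the point is just that
one flags word can carry several booleans without growing the argument
list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, named for illustration only. */
#define SWAPIN_SKIP_IF_EXISTS	(1u << 0)
#define SWAPIN_ADD_TO_LRU_TAIL	(1u << 1)
#define SWAPIN_ALL_FLAGS	(SWAPIN_SKIP_IF_EXISTS | SWAPIN_ADD_TO_LRU_TAIL)

/* The booleans that the single flags argument stands in for. */
struct swapin_opts {
	bool skip_if_exists;
	bool add_to_lru_tail;
};

/* Decode a flags word into the booleans it replaces. */
static struct swapin_opts swapin_decode_flags(unsigned int flags)
{
	struct swapin_opts opts;

	assert((flags & ~SWAPIN_ALL_FLAGS) == 0);	/* reject unknown bits */
	opts.skip_if_exists = (flags & SWAPIN_SKIP_IF_EXISTS) != 0;
	opts.add_to_lru_tail = (flags & SWAPIN_ADD_TO_LRU_TAIL) != 0;
	return opts;
}
```

With this shape, callers that pass 0 get neither behavior, and a new
behavior only costs a new bit rather than a new parameter.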

> > >
> > > So it seems a bit hacky just for a special case for the reasons above.
> >
> > It's a lot of plumbing for sure. Adding a flag to current task_struct
> > is a less-noisy yet-still-hacky solution. I am not saying we should do
> > it, but it's an option. I am not sure how many task flags we have to
> > spare.
>
> Got it.
> >
> > >
> > > Back to the beginning, lru_add_drain() is the simplest option, which is common
> > > below the __read_swap_cache_async(). Please see the function
> > > swap_cluster_readahead()
> > > and swap_vma_readahead(), of course it has been batched.
> > >
> > > Or we should leave this problem alone, before we can write back zswap
> > > in batches.
> >
> > Calling lru_add_drain() for every written back page is overkill
> > imo. If we have writeback batching at some point, it may make more
> > sense then.
>
> Agree.

Agree. lru_add_drain() does quite a bit, and doing it for every
written page makes batching less effective. And as argued above, I
don't think we should do this.

I'm fine with waiting til writeback batching too :) But that will be a
bigger task.

>
> >
> > Adding Michal Hocko, who was recently complaining [1] about lru_add_drain()
> > being called unnecessarily elsewhere.
>
> Got it, thanks.
> >
> > [1] https://lore.kernel.org/linux-mm/ZaD9BNtXZfY2UtVI@tiehlicka/