Re: [PATCH] mm: Do not start/end writeback for pages stored in zswap

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Mon, 10 Jun 2024 13:05:25 -0700

On Mon, Jun 10, 2024 at 12:08 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> On Mon, Jun 10, 2024 at 10:31:36AM GMT, Yosry Ahmed wrote:
> > On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> > >
> > > start/end writeback combination incorrectly increments NR_WRITTEN
> > > counter, even though the pages aren't written to disk. Pages successfully
> > > stored in zswap should just unlock folio and return from writepage.
> > >
> > > Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx>
> > > ---
> > >  mm/page_io.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/mm/page_io.c b/mm/page_io.c
> > > index a360857cf75d..501784d79977 100644
> > > --- a/mm/page_io.c
> > > +++ b/mm/page_io.c
> > > @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
> > >                 return ret;
> > >         }
> > >         if (zswap_store(folio)) {
> > > -               folio_start_writeback(folio);
> > >                 folio_unlock(folio);
> > > -               folio_end_writeback(folio);
> >
> > Removing these calls will have several effects, I am not really sure it's safe.
> >
> > 1. As you note in the commit log, NR_WRITTEN stats (and apparently
> > others) will no longer be updated. While this may make sense, it's a
> > user-visible change. I am not sure if anyone relies on this.
> >
>
> I couldn't imagine how this stat can be useful for the zswap case and I
> don't see much risk in changing this stat behavior for such cases.

It seems like NR_WRITTEN is only used in the 'global_dirty_state' trace event.

NR_WRITEBACK and NR_ZONE_WRITE_PENDING are state counters, not event
counters. They are incremented in folio_start_writeback() and
decremented in folio_end_writeback(). They are probably just causing
noise.

I think for both cases it's probably fine and not really visible to userspace.
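
To make the event vs. state counter distinction concrete, here is a
rough user-space toy model (not kernel code). The struct and the
model_* names are made up for illustration, and it assumes the event
counter is bumped when writeback ends:

```c
#include <assert.h>

/* Toy model only; these are not the kernel's data structures. */
struct wb_stats {
	long nr_writeback; /* state counter, stands in for NR_WRITEBACK and
			    * NR_ZONE_WRITE_PENDING: folios in flight now */
	long nr_written;   /* event counter, stands in for NR_WRITTEN */
};

/* Mirrors the accounting side of folio_start_writeback(). */
static void model_start_writeback(struct wb_stats *s)
{
	s->nr_writeback++;
}

/* Mirrors the accounting side of folio_end_writeback(). */
static void model_end_writeback(struct wb_stats *s)
{
	assert(s->nr_writeback > 0); /* must pair with a start */
	s->nr_writeback--;
	s->nr_written++;
}

/* Successful zswap store with the start/end pair (before the patch). */
static void model_zswap_store_old(struct wb_stats *s)
{
	model_start_writeback(s);
	model_end_writeback(s);
}

/* Successful zswap store with the pair removed (after the patch). */
static void model_zswap_store_new(struct wb_stats *s)
{
	(void)s; /* no writeback accounting at all */
}
```

In this model the state counter nets out to zero around each store
either way, while the event counter only grows with the old path.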

>
> > 2. folio_end_writeback() calls folio_rotate_reclaimable() after
> > writeback completes to put a folio that has been marked with
> > PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do
> > we get this call through other paths now?
> >
>
> The folio_rotate_reclaimable() only makes sense for async writeback
> pages i.e. not for zswap where we synchronously reclaim the page.

Looking at pageout(), it seems like we will clear PG_reclaim if the
folio is not under writeback, and in shrink_folio_list() if the folio
is not dirty or under writeback, we will reclaim right away. I thought
zswap being synchronous was an odd case, but apparently there is wider
support for synchronous reclaim.
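
Roughly, my reading of that flow as a user-space toy model (not kernel
code, heavily simplified; the struct, enum and model_* names are
hypothetical, and the 'async' flag just stands in for whether
->writepage left the folio under writeback):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy folio with only the flags relevant to this discussion. */
struct model_folio {
	bool dirty;
	bool writeback;
	bool reclaim; /* stands in for PG_reclaim */
};

enum model_outcome { MODEL_FREED_NOW, MODEL_KEPT_FOR_LATER };

/* Simplified pageout(): set PG_reclaim, write, clear it if sync. */
static void model_pageout(struct model_folio *f, bool async)
{
	assert(f->dirty);       /* writeback is only triggered if dirty */
	f->dirty = false;       /* dirty bit cleared before writing */
	f->reclaim = true;
	f->writeback = async;   /* zswap without start/end: stays false */
	if (!f->writeback)
		f->reclaim = false; /* synchronous case: nothing to rotate */
}

/* Simplified tail of shrink_folio_list() after a successful pageout. */
static enum model_outcome model_reclaim_one(struct model_folio *f, bool async)
{
	model_pageout(f, async);
	if (f->writeback || f->dirty)
		return MODEL_KEPT_FOR_LATER; /* rotated when writeback ends */
	return MODEL_FREED_NOW;              /* reclaimed right away */
}
```

With async == false (the zswap case after the patch) the folio is
freed immediately and PG_reclaim ends up clear, so there is nothing
left for folio_rotate_reclaimable() to do.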

Thanks for pointing this out.

>
> > 3. If I remember correctly, there was some sort of state machine where
> > folios go from dirty to writeback to clean. I am not sure what happens
> > if we take the writeback phase out of the equation.
> >
>
> Is there really such a state machine? We only trigger writeback if the
> page is dirty and we have cleared it. The only thing I can think of is
> the behavior of the waiters on PG_locked bit but the window of
> PG_writeback is so small that it seems like it does not matter.

I remember Matthew talking about it during LSF/MM this year when he
was discussing page flags, but maybe I am misremembering.