Re: [REGRESSION][BISECTED] Crash with Bad page state for FUSE/Flatpak related applications since v6.13

Joanne Koong <joannelkoong@xxxxxxxxx> · Tue, 11 Feb 2025 11:23:45 -0800

On Tue, Feb 11, 2025 at 6:01 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Mon, 2025-02-10 at 17:38 -0500, Jeff Layton wrote:
> > On Mon, 2025-02-10 at 20:36 +0000, Matthew Wilcox wrote:
> > > On Mon, Feb 10, 2025 at 02:12:35PM -0500, Josef Bacik wrote:
> > > > From: Josef Bacik <josef@xxxxxxxxxxxxxx>
> > > > Date: Mon, 10 Feb 2025 14:06:40 -0500
> > > > Subject: [PATCH] fuse: drop extra put of folio when using pipe splice
> > > >
> > > > In 3eab9d7bc2f4 ("fuse: convert readahead to use folios"), I converted
> > > > us to using the new folio readahead code, which drops the reference on
> > > > the folio once it is locked, using an inferred reference on the folio.
> > > > Previously we held a reference on the folio for the entire duration of
> > > > the readpages call.
> > > >
> > > > This is fine, however I failed to catch the case for splice pipe
> > > > responses where we will remove the old folio and splice in the new
> > > > folio.  Here we assumed that there is a reference held on the folio for
> > > > ap->folios, which is no longer the case.
> > > >
> > > > To fix this, simply drop the extra put to keep us consistent with the
> > > > non-splice variation.  This will fix the UAF bug that was reported.
> > > >
> > > > Link: https://lore.kernel.org/linux-fsdevel/2f681f48-00f5-4e09-8431-2b3dbfaa881e@xxxxxxxxx/
> > > > Fixes: 3eab9d7bc2f4 ("fuse: convert readahead to use folios")
> > > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> > > > ---
> > > >  fs/fuse/dev.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > > > index 5b5f789b37eb..5bd6e2e184c0 100644
> > > > --- a/fs/fuse/dev.c
> > > > +++ b/fs/fuse/dev.c
> > > > @@ -918,8 +918,6 @@ static int fuse_try_move_page(struct fuse_copy_state *cs, struct page **pagep)
> > > >   }
> > > >
> > > >   folio_unlock(oldfolio);
> > > > - /* Drop ref for ap->pages[] array */
> > > > - folio_put(oldfolio);
> > > >   cs->len = 0;
> > >
> > > But aren't we now leaking a reference to newfolio?  ie shouldn't
> > > we also:
> > >
> > > -   folio_get(newfolio);
> > >
> > > a few lines earlier?
> > >
> >
> >
> > I think that ref was leaking without Josef's patch, but your proposed
> > fix seems correct to me. There is:
> >
> > - 1 reference stolen from the pipe_buffer
> > - 1 reference taken for the pagecache in replace_page_cache_folio()
> > - the folio_get(newfolio) just after that
> >
> > The pagecache ref doesn't count here, and we only need the reference
> > that was stolen from the pipe_buffer to replace the one in pagep.
>
> Actually, no. I'm wrong here. A little after the folio_get(newfolio)
> call, we do:
>
>         /*
>          * Release while we have extra ref on stolen page.  Otherwise
>          * anon_pipe_buf_release() might think the page can be reused.
>          */
>         pipe_buf_release(cs->pipe, buf);
>
> ...so that accounts for the extra reference. I think the newfolio
> refcounting is correct as-is.

I think we do need to remove the folio_get(newfolio); here or we are
leaking the reference.

newfolio = page_folio(buf->page)   # ref is 1
replace_page_cache_folio()         # ref is 2
folio_get(newfolio)                # ref is 3
pipe_buf_release()                 # ref is 2

One ref belongs to the page cache and will be dropped by it, but the
other ref is unaccounted for (since the original patch removed the
"folio_put()" from fuse_readpages_end()).
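
To make the arithmetic concrete, here is a toy userspace model of the
counts above. It is only a sketch and not kernel code: struct
toy_folio, toy_get(), toy_put() and newfolio_refs_after_move() are
made-up names for illustration, and the model tracks nothing but the
counter itself.

```c
#include <assert.h>

/* Toy stand-in for the refcount on a struct folio (hypothetical type). */
struct toy_folio {
	int refcount;
};

/* Hypothetical helpers mirroring folio_get() / folio_put() on the toy type. */
static void toy_get(struct toy_folio *f)
{
	f->refcount++;
}

static void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Replays the newfolio sequence from the trace and returns how many
 * references remain once pipe_buf_release() has run. keep_extra_get
 * selects whether the folio_get(newfolio) call is kept (1) or dropped (0).
 */
static int newfolio_refs_after_move(int keep_extra_get)
{
	/* Starts with the single ref that came in with the pipe_buffer. */
	struct toy_folio newfolio = { .refcount = 1 };

	toy_get(&newfolio);		/* replace_page_cache_folio(): page cache ref */
	if (keep_extra_get)
		toy_get(&newfolio);	/* folio_get(newfolio) */
	toy_put(&newfolio);		/* pipe_buf_release() */

	return newfolio.refcount;
}
```

With the folio_get() kept, newfolio_refs_after_move(1) leaves two refs:
the page cache's plus one that nobody puts. With it dropped,
newfolio_refs_after_move(0) leaves only the page cache's ref.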

I still think acquiring an explicit reference on the folio before we
add it to ap->folios and then dropping it when we're completely done
with it in fuse_readpages_end() is the best solution, as that imo
makes the refcounting / lifetimes the most explicit / clear. For
example, in fuse_try_move_page(), if we get rid of that "folio_get()"
call, the page cache holds the only remaining reference on newfolio,
and we rely on the earlier "folio_clear_uptodate(newfolio);" line in
fuse_try_move_page() to guarantee that newfolio isn't freed out from
under us if memory gets tight and it's evicted from the page cache.

imo, a patch like this makes the refcounting clearest: