Re: [PATCH 1/1] NFS: Fix potential oops in nfs_inode_remove_request()

Jeff Layton <jlayton@xxxxxxxxxx> · Wed, 26 Jul 2023 08:40:33 -0400

On Tue, 2023-07-25 at 17:41 +0000, Trond Myklebust wrote:
> On Tue, 2023-07-25 at 12:24 -0400, Scott Mayhew wrote:
> > On Tue, 25 Jul 2023, Trond Myklebust wrote:
> > 
> > > On Tue, 2023-07-25 at 11:08 -0400, Scott Mayhew wrote:
> > > > Once a folio's private data has been cleared, it's possible for
> > > > another
> > > > process to clear the folio->mapping (e.g. via
> > > > invalidate_complete_folio2
> > > > or evict_mapping_folio), so it wouldn't be safe to call
> > > > nfs_page_to_inode() after that.
> > > > 
> > > > Fixes: 0c493b5cf16e ("NFS: Convert buffered writes to use
> > > > folios")
> > > > Signed-off-by: Scott Mayhew <smayhew@xxxxxxxxxx>
> > > > ---
> > > >  fs/nfs/write.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > > index f4cca8f00c0c..489c3f9dae23 100644
> > > > --- a/fs/nfs/write.c
> > > > +++ b/fs/nfs/write.c
> > > > @@ -785,6 +785,8 @@ static void nfs_inode_add_request(struct
> > > > nfs_page
> > > > *req)
> > > >   */
> > > >  static void nfs_inode_remove_request(struct nfs_page *req)
> > > >  {
> > > > +       struct nfs_inode *nfsi = NFS_I(nfs_page_to_inode(req));
> > > > +
> > > >         if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) {
> > > >                 struct folio *folio = nfs_page_to_folio(req-
> > > > > wb_head);
> > > >                 struct address_space *mapping =
> > > > folio_file_mapping(folio);
> > > > @@ -800,7 +802,7 @@ static void nfs_inode_remove_request(struct
> > > > nfs_page *req)
> > > >  
> > > >         if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
> > > >                 nfs_release_request(req);
> > > > -               atomic_long_dec(&NFS_I(nfs_page_to_inode(req))-
> > > > > nrequests);
> > > > +               atomic_long_dec(&nfsi->nrequests);
> > > 
> > > Why not just invert the order of the atomic_long_dec() and the
> > > nfs_release_request()? That way you are also ensuring that the
> > > inode is
> > > still pinned in memory by the open context.
> > 
> > I'm not following.  How does inverting the order prevent the
> > folio->mapping from getting clobbered?
> > 
> 
> The open/lock context is refcounted by the nfs_page until the latter is
> released. That's why the inode is guaranteed to remain around at least
> until the  call to nfs_release_request().
> 

The problem is not that the inode is going away, but rather that we
can't guarantee that the page is still part of the mapping at this
point, and so we can't safely dereference page->mapping there. I do see
that nfs_release_request releases a reference to the page, but I don't
think that's sufficient to ensure that it remains part of the mapping.

AFAICT, once we clear page->private, the page is subject to be removed
from the mapping. So, I *think* it's safe to just move the call to
nfs_page_to_inode prior to the call to nfs_page_group_sync_on_bit.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>