Re: [PATCH 08/16] btrfs: stop setting PageError in the data I/O path

David Sterba <dsterba@xxxxxxx> · Tue, 30 May 2023 15:34:46 +0200

On Tue, May 30, 2023 at 07:08:38AM +0100, Matthew Wilcox wrote:
> On Tue, May 30, 2023 at 07:45:47AM +0200, Christoph Hellwig wrote:
> > On Mon, May 29, 2023 at 07:52:10PM +0200, David Sterba wrote:
> > > On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote:
> > > > PageError is not used by the VFS/MM and deprecated.
> > > 
> > > Is there some reference other than code? I remember LSF/MM talks about
> > > writeback, reducing page flags but I can't find anything that would say
> > > that PageError is not used anymore. For such core changes in the
> > > neighboring subysystems I kind of rely on lwn.net, hearsay or accidental
> > > notice on fsdevel.
> > > 
> > > Removing the Error bit handling looks overall good but I'd also need to
> > > refresh my understanding of the interactions with writeback.
> > 
> > willy is the driving force behind the PageErro removal, so maybe he
> > has a coherent writeup.  But the basic idea is:
> > 
> >  - page flag space availability is limited, and killing any one we can
> >    easily is a good thing
> >  - PageError is not well defined and not part of any VM or VFS contract,
> >    so in addition to freeing up space it also generally tends to remove
> >    confusion.
> 
> I don't think I have a coherent writeup.  Basically:
> 
>  - The VFS and VM do not check the error flag
> 
>    $ git grep folio_test_error
>    fs/gfs2/lops.c: if (folio_test_error(folio))
>    mm/migrate.c:   if (folio_test_error(folio))
> 
>    (the use in mm/migrate.c replicates the error flag on migration)
> 
>    A similar grep -w PageError finds only tests in filesystems and
>    comments.
> 
>  - We do not document clearly under what circumstances the error flag
>    is set or cleared
> 
> If we can remove all uses of testing the error flag, then we can remove
> everywhere that sets or clears the flag.  This is usually a simple
> matter of checking folio_test_uptodate() instead of !PageError(),
> since the folio will not be marked uptodate if there is a read error.
> Write errors are not normally tracked on a per-folio basis.  Instead they
> are tracked through mapping_set_error().
> 
> I did a number of these conversions about a year ago; I'm happy Christoph
> is making progress with btrfs here.  See 'git log 6e8e79fc8443' for many
> of them.

Thank you, that helped.

I found one case of folio_set_error() that seems to be redundant:

   189  static noinline void end_compressed_writeback(const struct compressed_bio *cb)
   190  {
   191          struct inode *inode = &cb->bbio.inode->vfs_inode;
   192          struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
   193          unsigned long index = cb->start >> PAGE_SHIFT;
   194          unsigned long end_index = (cb->start + cb->len - 1) >> PAGE_SHIFT;
   195          struct folio_batch fbatch;
   196          const int errno = blk_status_to_errno(cb->bbio.bio.bi_status);
   197          int i;
   198          int ret;
   199  
   200          if (errno)
   201                  mapping_set_error(inode->i_mapping, errno);
   202  
   203          folio_batch_init(&fbatch);
   204          while (index <= end_index) {
   205                  ret = filemap_get_folios(inode->i_mapping, &index, end_index,
   206                                  &fbatch);
   207  
   208                  if (ret == 0)
   209                          return;
   210  
   211                  for (i = 0; i < ret; i++) {
   212                          struct folio *folio = fbatch.folios[i];
   213  
   214                          if (errno)
   215                                  folio_set_error(folio);
                                        ^^^^^^^^^^^^^^^^^^^^^^^

The error is set on the mapping on line 201 under the same condition.

With that removed and the super block write error handling moved away
from the Error bit the btrfs code will be Error-clean.

   216                          btrfs_page_clamp_clear_writeback(fs_info, &folio->page,
   217                                                           cb->start, cb->len);
   218                  }
   219                  folio_batch_release(&fbatch);
   220          }
   221          /* the inode may be gone now */
   222  }