Re: [PATCH 04/17] btrfs: handle checksum validation and repair at the storage layer

Josef Bacik <josef@xxxxxxxxxxxxxx> · Wed, 7 Sep 2022 14:15:22 -0400

On Thu, Sep 01, 2022 at 10:42:03AM +0300, Christoph Hellwig wrote:
> Currently btrfs handles checksum validation and repair in the end I/O
> handler for the btrfs_bio.  This leads to a lot of duplicate code
> plus issues with varying semantics or bugs, e.g.
> 
>  - the until recently completely broken repair for compressed extents
>  - the fact that encoded reads validate the checksums but do not kick
>    off read repair
>  - the inconsistent checking of the BTRFS_FS_STATE_NO_CSUMS flag
> 
> This commit revamps the checksum validation and repair code to instead
> work below the btrfs_submit_bio interfaces.  For this to work we need
> to make sure an inode is available, so that is added as a parameter
> to btrfs_bio_alloc.  With that btrfs_submit_bio can preload
> btrfs_bio.csum from the csum tree without help from the upper layers,
> and the low-level I/O completion can iterate over the bio and verify
> the checksums.
> 
> In case of a checksum failure (or a plain old I/O error), the repair
> is now kicked off before the upper level ->end_io handler is invoked.
> Tracking of the repair status is massively simplified by just keeping
> a small failed_bio structure per bio with failed sectors and otherwise
> using the information in the repair bio.  The per-inode I/O failure
> tree can be entirely removed.
> 
> The saved bvec_iter in the btrfs_bio is now completely managed by
> btrfs_submit_bio and must not be accessed by the callers.
> 
> There is one significant behavior change here:  If repair fails or
> is impossible to start with, the whole bio will be failed to the
> upper layer.  This is the behavior that all I/O submitters except
> for buffered I/O already emulated in their end_io handler.  For
> buffered I/O this now means that a large readahead request can
> fail due to a single bad sector, but as readahead errors are ignored
> the following readpage if the sector is actually accessed will
> still be able to read.  This also matches the I/O failure handling
> in other file systems.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>

Generally the change itself is fine, but there are several whitespace errors.
Additionally, this patch is quite large.  I would prefer that you first add the
new functionality and remove the various calls to the old io failure rec code,
and then remove the old io failure code itself in a follow-up patch.  This makes
it easier for reviewers to see what needs close attention and what can easily be
ignored.  Clearly I've already reviewed it, but if you rework it beyond fixing
the whitespace issues, it would be nice to split the changes into two.  Thanks,

Josef