Re: [PATCH 02/17] btrfs: save bio::bi_iter into btrfs_bio::iter before submitting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2021/12/1 13:17, Qu Wenruo wrote:
Since block layer will advance bio::bi_iter, at endio time we can no
longer rely on bio::bi_iter for split bio.

But for the incoming btrfs_bio split at btrfs_map_bio() time, we have to
ensure endio function is only executed for the split range, not the
whole original bio.

Thus this patch will introduce a new helper, btrfs_bio_save_iter(), to
save bi_iter into btrfs_bio::iter.

The following call sites need this helper call:

- btrfs_submit_compressed_read()
   For compressed read. For compressed write it doesn't really care as
   they use ordered extent.

- raid56_parity_write()
- raid56_parity_recovery()
   For RAID56.

- submit_stripe_bio()
   For all other cases.

These are not enough.

There are cases where we allocate a bio but without going through btrfs_map_bio(), and error out.

In that case, those bios don't have bbio::iter, and can cause errors in generic/475 related to data/metadata writeback failure.

Fixed in my github repo, by just adding more btrfs_bio_save_iter() calls in error paths.

Thanks,
Qu

Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
  fs/btrfs/compression.c |  3 +++
  fs/btrfs/raid56.c      |  2 ++
  fs/btrfs/volumes.c     | 14 ++++++++++++++
  fs/btrfs/volumes.h     | 18 ++++++++++++++++++
  4 files changed, 37 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e776956d5bc9..cc8d13369f53 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -870,6 +870,9 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
  	/* include any pages we added in add_ra-bio_pages */
  	cb->len = bio->bi_iter.bi_size;
+ /* Save bi_iter so that end_bio_extent_readpage() won't freak out. */
+	btrfs_bio_save_iter(btrfs_bio(bio));
+
  	while (cur_disk_byte < disk_bytenr + compressed_len) {
  		u64 offset = cur_disk_byte - disk_bytenr;
  		unsigned int index = offset >> PAGE_SHIFT;
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 0e239a4c3b26..13e726c88a81 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1731,6 +1731,7 @@ int raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc,
  		return PTR_ERR(rbio);
  	}
  	bio_list_add(&rbio->bio_list, bio);
+	btrfs_bio_save_iter(btrfs_bio(bio));
  	rbio->bio_list_bytes = bio->bi_iter.bi_size;
  	rbio->operation = BTRFS_RBIO_WRITE;
@@ -2135,6 +2136,7 @@ int raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc, rbio->operation = BTRFS_RBIO_READ_REBUILD;
  	bio_list_add(&rbio->bio_list, bio);
+	btrfs_bio_save_iter(btrfs_bio(bio));
  	rbio->bio_list_bytes = bio->bi_iter.bi_size;
rbio->faila = find_logical_bio_stripe(rbio, bio);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f38c230111be..b70037cc1a51 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6829,6 +6829,20 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
  		BUG();
  	}
+ /*
+	 * At endio time, bi_iter is no longer reliable, thus we have to save
+	 * current bi_iter into btrfs_bio so that even for split bio we can
+	 * iterate only the split part.
+	 *
+	 * And this has to be done before any bioc error, as endio functions
+	 * will rely on bbio::iter.
+	 *
+	 * For bio create by btrfs_bio_slit() or btrfs_bio_clone*(), it's
+	 * already set, but we can still have original bio which has its
+	 * iter not initialized.
+	 */
+	btrfs_bio_save_iter(btrfs_bio(bio));
+
  	for (dev_nr = 0; dev_nr < total_devs; dev_nr++) {
  		dev = bioc->stripes[dev_nr].dev;
  		if (!dev || !dev->bdev || test_bit(BTRFS_DEV_STATE_MISSING,
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 3b8130680749..f9178d2c2fd6 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -334,6 +334,12 @@ struct btrfs_bio {
  	struct btrfs_device *device;
  	u8 *csum;
  	u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE];
+	/*
+	 * Saved bio::bi_iter before submission.
+	 *
+	 * This allows us to interate the cloned/split bio properly, as at
+	 * endio time bio::bi_iter is no longer reliable.
+	 */
  	struct bvec_iter iter;
/*
@@ -356,6 +362,18 @@ static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
  	}
  }
+/*
+ * To save bbio::bio->bi_iter into bbio::iter so for callers who need the
+ * original bi_iter can access the original part of the bio.
+ * This is especially important for the incoming split btrfs_bio, which needs
+ * to call its endio for and only for the split range.
+ */
+static inline void btrfs_bio_save_iter(struct btrfs_bio *bbio)
+{
+	if (!bbio->iter.bi_size)
+		bbio->iter = bbio->bio.bi_iter;
+}
+
  struct btrfs_io_stripe {
  	struct btrfs_device *dev;
  	u64 physical;





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux