Re: [PATCH] btrfs: scrub: fix false alerts on zoned device scrubing

Qu Wenruo <wqu@xxxxxxxx> · Tue, 5 Mar 2024 19:27:30 +1030

在 2024/3/5 19:18, Naohiro Aota 写道:
On Tue, Mar 05, 2024 at 01:18:22PM +1030, Qu Wenruo wrote:
[BUG]
When using zoned devices (zbc), scrub would always report super block
errors like the following:

   # btrfs scrub start -fB /mnt/btrfs/
   Starting scrub on devid 1
   scrub done for b7b5c759-1baa-4561-a0ca-b8d0babcde56
   Scrub started:    Tue Mar  5 12:49:14 2024
   Status:           finished
   Duration:         0:00:00
   Total to scrub:   288.00KiB
   Rate:             288.00KiB/s
   Error summary:    super=2
     Corrected:      0
     Uncorrectable:  0
     Unverified:     0

[CAUSE]
Since the very beginning of scrub, we always go with btrfs_sb_offset()
to grab the super blocks.
This is fine for regular btrfs filesystems, but for zoned btrfs, super
blocks are stored in dedicated zones with a ring buffer like structure.

This means the old btrfs_sb_offset() is not able to give the correct
bytenr for us to grabbing the super blocks, thus except the primary
super block, the rest would be garbage and cause the above false alerts.

[FIX]
Instead of btrfs_sb_offset(), go with btrfs_sb_log_location() which is
zoned friendly, to grab the correct super block location.

This would introduce new error patterns, as btrfs_sb_log_location() can
fail with extra errors.

Here for -ENOENT we just end the scrub as there are no more super
blocks.
For other errors, we record it as a super block error and exit.

Reported-by: WA AM <waautomata@xxxxxxxxx>
Link: https://lore.kernel.org/all/CANU2Z0EvUzfYxczLgGUiREoMndE9WdQnbaawV5Fv5gNXptPUKw@xxxxxxxxxxxxxx/
CC: stable@xxxxxxxxxxxxxxx # 5.15+
Signed-off-by: Johannes Thumshirn <Johannes.Thumshirn@xxxxxxx>
Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
  fs/btrfs/scrub.c | 13 +++++++++++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index c4bd0e60db59..e1b67baa4072 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2788,7 +2788,6 @@ static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
  					   struct btrfs_device *scrub_dev)
  {
  	int	i;
-	u64	bytenr;
  	u64	gen;
  	int ret = 0;
  	struct page *page;
@@ -2812,7 +2811,17 @@ static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
  		gen = btrfs_get_last_trans_committed(fs_info);
  
  	for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
-		bytenr = btrfs_sb_offset(i);
+		u64 bytenr;
+
+		ret = btrfs_sb_log_location(scrub_dev, i, 0, &bytenr);
+		if (ret == -ENOENT)
+			break;
+		if (ret < 0) {
+			spin_lock(&sctx->stat_lock);
+			sctx->stat.super_errors++;
+			spin_unlock(&sctx->stat_lock);
+			break;

Since an error from scrub_one_super can continue, this can be "continue"
also? > E.g, if btrfs_sb_log_location() returns -EUCLEAN on the 2nd SB, it
fails to detect the 3rd SB's corruption.

You're right, I originally though it can return error from 
blkdev_zone_mgmt() and if that failed we can no longer continue.

But that can only happen in write case, meanwhile we're in the READ path.
And in that case, sb_write_pointer() can return -ECULEAN and we can 
still continue.

Would allow continue for non-ENOENT errors.

Thanks,
Qu

Other than that, looks good.

Reviewed-by: Naohiro Aota <naohiro.aota@xxxxxxx>

+		}
  		if (bytenr + BTRFS_SUPER_INFO_SIZE >
  		    scrub_dev->commit_total_bytes)
  			break;
--
2.44.0