Patch "btrfs: don't readahead the relocation inode on RST" has been added to the 6.11-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    btrfs: don't readahead the relocation inode on RST

to the 6.11-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     btrfs-don-t-readahead-the-relocation-inode-on-rst.patch
and it can be found in the queue-6.11 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit ae40ffac9faf5d58a2c171a33acdda5980a5913c
Author: Johannes Thumshirn <jthumshirn@xxxxxxx>
Date:   Wed Jul 31 22:43:06 2024 +0200

    btrfs: don't readahead the relocation inode on RST
    
    [ Upstream commit 04915240e2c3a018e4c7f23418478d27226c8957 ]
    
    On relocation we're doing readahead on the relocation inode, but if the
    filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
    preallocated extents not being mapped in the RST) from the lookup.
    
    But readahead doesn't handle the error and submits invalid reads to the
    device, causing an assertion in the scatter-gather list code:
    
      BTRFS info (device nvme1n1): balance: start -d -m -s
      BTRFS info (device nvme1n1): relocating block group 6480920576 flags data|raid0
      BTRFS error (device nvme1n1): cannot find raid-stripe for logical [6481928192, 6481969152] devid 2, profile raid0
      ------------[ cut here ]------------
      kernel BUG at include/linux/scatterlist.h:115!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 0 PID: 1012 Comm: btrfs Not tainted 6.10.0-rc7+ #567
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000002cd11000 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Call Trace:
       <TASK>
       ? __die_body.cold+0x14/0x25
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x65/0x80
       ? __blk_rq_map_sg+0x339/0x4a0
       ? exc_invalid_op+0x50/0x70
       ? __blk_rq_map_sg+0x339/0x4a0
       ? asm_exc_invalid_op+0x1a/0x20
       ? __blk_rq_map_sg+0x339/0x4a0
       nvme_prep_rq.part.0+0x9d/0x770
       nvme_queue_rq+0x7d/0x1e0
       __blk_mq_issue_directly+0x2a/0x90
       ? blk_mq_get_budget_and_tag+0x61/0x90
       blk_mq_try_issue_list_directly+0x56/0xf0
       blk_mq_flush_plug_list.part.0+0x52b/0x5d0
       __blk_flush_plug+0xc6/0x110
       blk_finish_plug+0x28/0x40
       read_pages+0x160/0x1c0
       page_cache_ra_unbounded+0x109/0x180
       relocate_file_extent_cluster+0x611/0x6a0
       ? btrfs_search_slot+0xba4/0xd20
       ? balance_dirty_pages_ratelimited_flags+0x26/0xb00
       relocate_data_extent.constprop.0+0x134/0x160
       relocate_block_group+0x3f2/0x500
       btrfs_relocate_block_group+0x250/0x430
       btrfs_relocate_chunk+0x3f/0x130
       btrfs_balance+0x71b/0xef0
       ? kmalloc_trace_noprof+0x13b/0x280
       btrfs_ioctl+0x2c2e/0x3030
       ? kvfree_call_rcu+0x1e6/0x340
       ? list_lru_add_obj+0x66/0x80
       ? mntput_no_expire+0x3a/0x220
       __x64_sys_ioctl+0x96/0xc0
       do_syscall_64+0x54/0x110
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fcc04514f9b
      Code: Unable to access opcode bytes at 0x7fcc04514f71.
      RSP: 002b:00007ffeba923370 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fcc04514f9b
      RDX: 00007ffeba923460 RSI: 00000000c4009420 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000013 R09: 0000000000000001
      R10: 00007fcc043fbba8 R11: 0000000000000246 R12: 00007ffeba924fc5
      R13: 00007ffeba923460 R14: 0000000000000002 R15: 00000000004d4bb0
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fcc04514f71 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception ]---
    
    So in case of a relocation on a RAID stripe-tree based file system, skip
    the readahead.
    
    Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
    Reviewed-by: Qu Wenruo <wqu@xxxxxxxx>
    Signed-off-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
    Reviewed-by: David Sterba <dsterba@xxxxxxxx>
    Signed-off-by: David Sterba <dsterba@xxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 0533d0f82dc99..ea4ed85919ec8 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -36,6 +36,7 @@
 #include "relocation.h"
 #include "super.h"
 #include "tree-checker.h"
+#include "raid-stripe-tree.h"
 
 /*
  * Relocation overview
@@ -2965,21 +2966,34 @@ static int relocate_one_folio(struct reloc_control *rc,
 	u64 folio_end;
 	u64 cur;
 	int ret;
+	const bool use_rst = btrfs_need_stripe_tree_update(fs_info, rc->block_group->flags);
 
 	ASSERT(index <= last_index);
 	folio = filemap_lock_folio(inode->i_mapping, index);
 	if (IS_ERR(folio)) {
-		page_cache_sync_readahead(inode->i_mapping, ra, NULL,
-					  index, last_index + 1 - index);
+
+		/*
+		 * On relocation we're doing readahead on the relocation inode,
+		 * but if the filesystem is backed by a RAID stripe tree we can
+		 * get ENOENT (e.g. due to preallocated extents not being
+		 * mapped in the RST) from the lookup.
+		 *
+		 * But readahead doesn't handle the error and submits invalid
+		 * reads to the device, causing a assertion failures.
+		 */
+		if (!use_rst)
+			page_cache_sync_readahead(inode->i_mapping, ra, NULL,
+						  index, last_index + 1 - index);
 		folio = __filemap_get_folio(inode->i_mapping, index,
-					    FGP_LOCK | FGP_ACCESSED | FGP_CREAT, mask);
+					    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
+					    mask);
 		if (IS_ERR(folio))
 			return PTR_ERR(folio);
 	}
 
 	WARN_ON(folio_order(folio));
 
-	if (folio_test_readahead(folio))
+	if (folio_test_readahead(folio) && !use_rst)
 		page_cache_async_readahead(inode->i_mapping, ra, NULL,
 					   folio, last_index + 1 - index);
 




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux