Re: [PATCH v2] btrfs: handle missing chunk mapping more gracefully

Qu Wenruo <quwenruo.btrfs@xxxxxxx> · Sun, 11 Feb 2024 14:52:26 +1030

On 2024/2/11 13:43, Celeste Liu wrote:
On 2024-02-11 03:58, Qu Wenruo wrote:



On 2024/2/11 01:04, Celeste Liu wrote:
On 3/2/23 09:54, Qu Wenruo wrote:
[BUG]
During my scrub rework, I did a stupid thing like this:

           bio->bi_iter.bi_sector = stripe->logical;
           btrfs_submit_bio(fs_info, bio, stripe->mirror_num);

Above bi_sector assignment is using logical address directly, which
lacks ">> SECTOR_SHIFT".

This results a read on a range which has no chunk mapping.

This results the following crash:

    BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536
    assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387
    ------------[ cut here ]------------

Sure this is all my fault, but this shows a possible problem in real
world, that some bitflip in file extents/tree block can point to
unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT
is not configured, cause invalid pointer.

[PROBLEMS]
In above call chain, we just don't handle the possible error from
btrfs_get_chunk_map() inside __btrfs_map_block().

[FIX]
The fix is pretty straightforward, just replace the ASSERT() with proper
error handling.

Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
Changelog:
v2:
- Rebased to latest misc-next
     The error path in bio.c is already fixed, thus only need to replace
     the ASSERT() in __btrfs_map_block().
---
    fs/btrfs/volumes.c | 3 ++-
    1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4d479ac233a4..93bc45001e68 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
            return -EINVAL;

        em = btrfs_get_chunk_map(fs_info, logical, *length);
-    ASSERT(!IS_ERR(em));
+    if (IS_ERR(em))
+        return PTR_ERR(em);

        map = em->map_lookup;
        data_stripes = nr_data_stripes(map);

This bug affects 6.1.y LTS branch but no backport commit of this fix in
6.1.y branch. Please include it. Thanks.

Just want to make sure, are you hitting the ASSERT()?

No, in non-debug build, ASSERT() is noop. So em will be a pointer whose
value is errno and there is no any error handle for it. And then it
trigger kernel NULL Pointer Dereference exception in btrfs_get_io_geometry()
with em->map_lookup (Of course, it's not 0 but 0x5a
(-EINVAL + offsetof(map_lookup)). But it still is a illegal memory access),
this kernel thread was killed but the lock hold by this thread are not
released. After that, all fs operations was block by the lock.

I know, what I mean "hitting the ASSERT()" includes the cases where
CONFIG_BTRFS_ASSERT is not enabled.
As it would lead to an obvious invalid memory access immediately.

And lock/resources thing is the last thing you need to bother, as long
as the kernel fs module triggered invalid memory access inside kernel
space, it's busted already.



If so, please provide the calltrace and btrfs-check output to pin down
the cause.

It doesn't happen on my machine, I've notified the finder to send a
backtrace to the mailing list

It's better to allow the reporter to send needed info directly to the ML.

Trust me, working with a man in the middle is only going to slow down
diagnosis and fix.

And don't forget "btrfs check --readonly", as on-disk corrupted is also
a very possible reason to lead IO out of chunk mapped ranges.


Just backporting the patch is only to make it a little more graceful,
but won't solve the root problem.

No. In non debug build, ASSERT() is noop so it lead that there is no any
error handle code. There is no RAII/SBRM, unexpected exit will lead the
resources will not be released correctly. If it has unique holder (e.g. mutex),
it will break all other code that need this resource!

Sure, but even with a backport, it would still cause (although graceful)
errors when such IO is triggered.

Hitting the ASSERT() is only a symptom, not the root cause.

I can definitely do a backport immediately, but I'm more interested in
the root cause.
What if it's caused by another hidden bug?

Thanks,
Qu


Thanks,
Qu