Re: EXT4: kernel BUG at fs/ext4/mballoc.c:1721!

"Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> · Thu, 3 Sep 2009 16:50:03 +0530

On Thu, Sep 03, 2009 at 01:46:08PM +0530, Sachin Sant wrote:
> While executing an FS resize test against ext4 on a 4-way
> POWER6 box with a 2.6.31-rc8 kernel, I ran into the following bug.
>
> ------------[ cut here ]------------
> cpu 0x2: Vector: 700 (Program Check) at [c0000000f963ece0]
>    pc: c000000000264d80: .ext4_mb_good_group+0x54/0x15c
>    lr: c00000000026c9b0: .ext4_mb_regular_allocator+0x278/0x44c
>    sp: c0000000f963ef60
>   msr: 8000000000029032
>  current = 0xc000000047b635a0
>  paca    = 0xc000000000b62a00
>    pid   = 32202, comm = dd
> kernel BUG at fs/ext4/mballoc.c:1721!
> enter ? for help
> [link register   ] c00000000026c9b0 .ext4_mb_regular_allocator+0x278/0x44c
> [c0000000f963ef60] c00000000026c99c .ext4_mb_regular_allocator+0x264/0x44c (unreliable)
> [c0000000f963f090] c00000000026cde0 .ext4_mb_new_blocks+0x25c/0x5b0
> [c0000000f963f170] c000000000263260 .ext4_ext_get_blocks+0xd18/0xf2c
> [c0000000f963f2f0] c0000000002404a8 .ext4_get_blocks+0x1b8/0x438
> [c0000000f963f3c0] c000000000241d8c .ext4_get_block+0xe8/0x15c
> [c0000000f963f480] c00000000018e1c0 .__block_prepare_write+0x210/0x4b0
> [c0000000f963f5c0] c00000000018e698 .block_write_begin+0xa8/0x13c
> [c0000000f963f680] c000000000243be4 .ext4_write_begin+0x198/0x324
> [c0000000f963f790] c000000000112e50 .generic_file_buffered_write+0x140/0x37c
> [c0000000f963f8d0] c00000000011364c .__generic_file_aio_write_nolock+0x37c/0x3e0
> [c0000000f963f9d0] c0000000001140e0 .generic_file_aio_write+0x88/0x120
> [c0000000f963fa90] c000000000239250 .ext4_file_write+0xe4/0x1a4
> [c0000000f963fb40] c00000000015e1f4 .do_sync_write+0xcc/0x130
> [c0000000f963fce0] c00000000015ef44 .vfs_write+0xd0/0x1dc
> [c0000000f963fd80] c00000000015f158 .SyS_write+0x58/0xa0
> [c0000000f963fe30] c000000000008534 syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at 00000fff8fd1a8f8
> SP (fffc6270e00) is in userspace
>
> During the first 3 runs I did not see this issue, so I might
> not be able to recreate it. I have captured the dmesg
> log and attached it.
>
> The ext4 fs was created and mounted using:
>
> mkfs.ext4 -b 1024 /dev/sda4 3943948
> mount  -t ext4 -o errors=panic,data=journal /dev/sda4 /mnt/tmp/
>
> The corresponding C code is:
>
> 1718         struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
> 1719
> 1720         BUG_ON(cr < 0 || cr >= 4);
> 1721         BUG_ON(EXT4_MB_GRP_NEED_INIT(grp));
>              ^^^^^^^^^^^^^^^^^^^^
> 1722
> 1723         free = grp->bb_free;
>
> Thanks
> -Sachin

Can you try this patch?

commit 43149bc800a6ae88b7d984558403e8d8cb045138
Author: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
Date:   Thu Sep 3 16:47:27 2009 +0530

    ext4: check for good group with alloc_sem held
    
    We need to make sure we check for good group with alloc_sem
    held to make sure we prevent a parallel addition of new blocks
    to the group via resize.
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index cd25846..4623555 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2029,13 +2029,6 @@ repeat:
 					goto out;
 			}
 
-			/*
-			 * If the particular group doesn't satisfy our
-			 * criteria we continue with the next group
-			 */
-			if (!ext4_mb_good_group(ac, group, cr))
-				continue;
-
 			err = ext4_mb_load_buddy(sb, group, &e4b);
 			if (err)
 				goto out;