On Thu, Sep 03, 2009 at 01:46:08PM +0530, Sachin Sant wrote:
> While executing FS resize test against ext4 on a 4-way
> POWER6 box with 2.6.31-rc8 kernel ran into following bug.
>
> ------------[ cut here ]------------
> cpu 0x2: Vector: 700 (Program Check) at [c0000000f963ece0]
>     pc: c000000000264d80: .ext4_mb_good_group+0x54/0x15c
>     lr: c00000000026c9b0: .ext4_mb_regular_allocator+0x278/0x44c
>     sp: c0000000f963ef60
>    msr: 8000000000029032
>   current = 0xc000000047b635a0
>   paca    = 0xc000000000b62a00
>     pid   = 32202, comm = dd
> kernel BUG at fs/ext4/mballoc.c:1721!
> enter ? for help
> [link register   ] c00000000026c9b0 .ext4_mb_regular_allocator+0x278/0x44c
> [c0000000f963ef60] c00000000026c99c .ext4_mb_regular_allocator+0x264/0x44c
> (unreliable)
> [c0000000f963f090] c00000000026cde0 .ext4_mb_new_blocks+0x25c/0x5b0
> [c0000000f963f170] c000000000263260 .ext4_ext_get_blocks+0xd18/0xf2c
> [c0000000f963f2f0] c0000000002404a8 .ext4_get_blocks+0x1b8/0x438
> [c0000000f963f3c0] c000000000241d8c .ext4_get_block+0xe8/0x15c
> [c0000000f963f480] c00000000018e1c0 .__block_prepare_write+0x210/0x4b0
> [c0000000f963f5c0] c00000000018e698 .block_write_begin+0xa8/0x13c
> [c0000000f963f680] c000000000243be4 .ext4_write_begin+0x198/0x324
> [c0000000f963f790] c000000000112e50 .generic_file_buffered_write+0x140/0x37c
> [c0000000f963f8d0] c00000000011364c
> .__generic_file_aio_write_nolock+0x37c/0x3e0
> [c0000000f963f9d0] c0000000001140e0 .generic_file_aio_write+0x88/0x120
> [c0000000f963fa90] c000000000239250 .ext4_file_write+0xe4/0x1a4
> [c0000000f963fb40] c00000000015e1f4 .do_sync_write+0xcc/0x130
> [c0000000f963fce0] c00000000015ef44 .vfs_write+0xd0/0x1dc
> [c0000000f963fd80] c00000000015f158 .SyS_write+0x58/0xa0
> [c0000000f963fe30] c000000000008534 syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at 00000fff8fd1a8f8
> SP (fffc6270e00) is in userspace
>
> During the first 3 runs i did not see this issue, so might
> not be able to recreate this again.
> I have captured the dmesg
> log and have attached it.
>
> ext4 fs was created and mounted using :
>
> mkfs.ext4 -b 1024 /dev/sda4 3943948
> mount -t ext4 -o errors=panic,data=journal /dev/sda4 /mnt/tmp/
>
> The corresponding c code is :
>
> 1718         struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb,
> group);
> 1719
> 1720         BUG_ON(cr < 0 || cr >= 4);
> 1721         BUG_ON(EXT4_MB_GRP_NEED_INIT(grp));
> 1722                ^^^^^^^^^^^^^^^^^^^^
> 1723         free = grp->bb_free;
>
> Thanks
> -Sachin

Can you try this patch?

commit 43149bc800a6ae88b7d984558403e8d8cb045138
Author: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
Date:   Thu Sep 3 16:47:27 2009 +0530

    ext4: check for good group with alloc_sem held

    We need to make sure we check for good group with alloc_sem held
    to make sure we prevent a parallel addition of new blocks
    to the group via resize.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index cd25846..4623555 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2029,13 +2029,6 @@ repeat:
 				goto out;
 			}
 
-			/*
-			 * If the particular group doesn't satisfy our
-			 * criteria we continue with the next group
-			 */
-			if (!ext4_mb_good_group(ac, group, cr))
-				continue;
-
 			err = ext4_mb_load_buddy(sb, group, &e4b);
 			if (err)
 				goto out;
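
For illustration only, here is a small userspace sketch of the idea in the commit message: the "good group" check is made only after the group has been loaded with the lock held, so a concurrent resize cannot leave the group flagged as needing init between the load and the check. Everything in it is an assumption made for the example and not the ext4 code: struct group_info, load_group(), unload_group(), good_group_locked(), try_alloc() and resize_add_blocks() are hypothetical names, and a pthread mutex stands in for alloc_sem.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-group allocator state (not the ext4 structs). */
struct group_info {
	pthread_mutex_t alloc_sem;	/* stand-in for the kernel's alloc_sem */
	bool need_init;			/* plays the role of EXT4_MB_GRP_NEED_INIT() */
	unsigned int free_blocks;
};

/* Resize path: add blocks and mark the cached state stale, under the lock. */
static void resize_add_blocks(struct group_info *g, unsigned int added)
{
	pthread_mutex_lock(&g->alloc_sem);
	g->free_blocks += added;
	g->need_init = true;
	pthread_mutex_unlock(&g->alloc_sem);
}

/* Take the lock and refresh stale state; returns with the lock held. */
static void load_group(struct group_info *g)
{
	pthread_mutex_lock(&g->alloc_sem);
	if (g->need_init)
		g->need_init = false;	/* a real allocator would rebuild its cache here */
}

/* Release the lock taken by load_group(). */
static void unload_group(struct group_info *g)
{
	pthread_mutex_unlock(&g->alloc_sem);
}

/* Caller must hold alloc_sem; the assert mirrors the BUG_ON in the report. */
static bool good_group_locked(const struct group_info *g, unsigned int wanted)
{
	assert(!g->need_init);
	return g->free_blocks >= wanted;
}

/* Allocation path: load first, then check, all under the same lock. */
static bool try_alloc(struct group_info *g, unsigned int wanted)
{
	bool ok;

	load_group(g);
	ok = good_group_locked(g, wanted);
	if (ok)
		g->free_blocks -= wanted;
	unload_group(g);
	return ok;
}
```

If good_group_locked() were instead called before load_group(), a resize_add_blocks() that ran in between could leave need_init set and trip the assert, which is the kind of window the patch is meant to close.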