Hi,

I got the following crash from ext4. Although I have a partially dumped vmcore, I have not been able to find the root cause, so any ideas are appreciated.

The kernel is 4.9.0.

[943102.751128] EXT4-fs error (device nvme0n1p1): ext4_wait_block_bitmap:503: comm fsabc: Cannot read block bitmap - block_group = 8383, block_bitmap = 274202639
[943102.751131] EXT4-fs (nvme0n1p1): previous I/O error to superblock detected
[943102.751134] Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
[943102.751298] ------------[ cut here ]------------
[943102.751299] kernel BUG at fs/ext4/mballoc.c:1911!
[943102.751300] invalid opcode: 0000 [#1] SMP
...
[943102.751316] task: ffff887be5fabe00 task.stack: ffffc90037994000
[943102.751339] RIP: 0010:[<ffffffffa05336dc>] [<ffffffffa05336dc>] ext4_mb_simple_scan_group+0x14c/0x160 [ext4]
[943102.751340] RSP: 0018:ffffc90037997820 EFLAGS: 00010246
[943102.751340] RAX: 0000000000000008 RBX: 000000000000000c RCX: 0000000000000008
[943102.751341] RDX: 0000000000000040 RSI: 0000000000000038 RDI: ffff88634047aff8
[943102.751342] RBP: ffffc90037997860 R08: ffff000000000000 R09: 000015f80607a517
[943102.751342] R10: ffff887e3f89c000 R11: ffff886ff7a54928 R12: ffff887d002cde80
[943102.751343] R13: ffff887e3cabc000 R14: ffffc90037997898 R15: 0000000000000030
[943102.751344] FS: 00007f1da6eff700(0000) GS:ffff887e7e680000(0000) knlGS:0000000000000000
[943102.751345] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[943102.751345] CR2: 00007f1d78008708 CR3: 0000007be3012000 CR4: 00000000007606f0
[943102.751346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[943102.751347] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[943102.751347] PKRU: 55555554
[943102.751347] Stack:
[943102.751350] 000000080000417f 0000000800000001 6a0d0bf01fb48b22 ffff887d002cde80
[943102.751351] 00000000000020bf 0000000000000000 0000000000000000 ffff887e3cabc000
[943102.751352] ffffc90037997900 ffffffffa05340a6 00000000379978a0 ffff887e3cabd000
[943102.751353] Call Trace:
[943102.751363] [<ffffffffa05340a6>] ext4_mb_regular_allocator+0x356/0x460 [ext4]
[943102.751371] [<ffffffffa0535b9c>] ext4_mb_new_blocks+0x5ec/0xaf0 [ext4]
[943102.751379] [<ffffffffa052434d>] ? __read_extent_tree_block+0x5d/0x1f0 [ext4]
[943102.751386] [<ffffffffa0525633>] ? ext4_find_extent+0x143/0x2d0 [ext4]
[943102.751394] [<ffffffffa052a7de>] ext4_ext_map_blocks+0xb5e/0xf30 [ext4]
[943102.751397] [<ffffffff811bb75c>] ? node_dirty_ok+0x12c/0x170
[943102.751403] [<ffffffffa04f7802>] ext4_map_blocks+0x172/0x600 [ext4]
[943102.751406] [<ffffffff8127a8c1>] ? alloc_buffer_head+0x21/0x60
[943102.751407] [<ffffffff81233601>] ? mem_cgroup_commit_charge+0x91/0x530
[943102.751413] [<ffffffffa04f7d22>] _ext4_get_block+0x92/0x100 [ext4]
[943102.751419] [<ffffffffa04f7da6>] ext4_get_block+0x16/0x20 [ext4]
[943102.751420] [<ffffffff8127d357>] __block_write_begin_int+0x197/0x5e0
[943102.751425] [<ffffffffa04f7d90>] ? _ext4_get_block+0x100/0x100 [ext4]
[943102.751432] [<ffffffffa04fcb56>] ? ext4_write_begin+0x126/0x5b0 [ext4]
[943102.751433] [<ffffffff8127d7b1>] __block_write_begin+0x11/0x20
[943102.751439] [<ffffffffa04fcbdc>] ext4_write_begin+0x1ac/0x5b0 [ext4]
[943102.751446] [<ffffffffa052cfdd>] ? __ext4_journal_stop+0x3d/0xa0 [ext4]
[943102.751449] [<ffffffff811ab578>] generic_perform_write+0xc8/0x1c0
[943102.751451] [<ffffffff8125f37e>] ? file_update_time+0x5e/0x110
[943102.751452] [<ffffffff811adbb5>] __generic_file_write_iter+0x185/0x1d0
[943102.751458] [<ffffffffa04f1a6b>] ext4_file_write_iter+0x8b/0x380 [ext4]
[943102.751460] [<ffffffff81247429>] ? vfs_getattr_nosec+0x29/0x40
[943102.751462] [<ffffffff81247c6f>] ? cp_new_stat+0x14f/0x180
[943102.751463] [<ffffffff81241115>] __vfs_write+0xe5/0x160
[943102.751464] [<ffffffff812423b5>] vfs_write+0xb5/0x1a0
[943102.751465] [<ffffffff81243875>] SyS_write+0x55/0xc0
[943102.751468] [<ffffffff8171a6da>] entry_SYSCALL_64_fastpath+0x1a/0xc5
[943102.751476] Code: 39 44 24 3c 75 27 49 8b 85 60 <0f> 0b 0f 0b e8 3b 57 b5 e0 90 66
[943102.751484] RIP [<ffffffffa05336dc>] ext4_mb_simple_scan_group+0x14c/0x160 [ext4]
[943102.751484] RSP <ffffc90037997820>

Here is my diagnosis so far.

The code of interest is:
---------------------------
static noinline_for_stack
void ext4_mb_simple_scan_group(struct ext4_allocation_context *ac,
					struct ext4_buddy *e4b)
{
	struct super_block *sb = ac->ac_sb;
	struct ext4_group_info *grp = e4b->bd_info;
	void *buddy;
	int i;
	int k;
	int max;

	BUG_ON(ac->ac_2order <= 0);
	for (i = ac->ac_2order; i <= sb->s_blocksize_bits + 1; i++) {
		if (grp->bb_counters[i] == 0)
			continue;

		buddy = mb_find_buddy(e4b, i, &max);
		BUG_ON(buddy == NULL);

		k = mb_find_next_zero_bit(buddy, max, 0);
		BUG_ON(k >= max);

		ac->ac_found++;

		ac->ac_b_ex.fe_len = 1 << i;
		ac->ac_b_ex.fe_start = k << i;
		ac->ac_b_ex.fe_group = e4b->bd_group;
-------------------------------------

(a) ac->ac_2order = 11:

crash> grep ac_2order ac.txt
  ac_2order = 11 '\v',

(b) ac_b_ex.fe_start = 0 and ac_b_ex.fe_len = 0, i.e. ac_b_ex was never filled in, so order 11 was skipped by the "continue":

crash> grep ac_b_ex -A 5 ac.txt
  ac_b_ex = {
    fe_logical = 2048,
    fe_start = 0,
    fe_group = 0,
    fe_len = 0
  },

(c) grp->bb_counters[11] = 0, so the loop moves on to i = 12.

(d) According to the in-memory counters, order 12 has 1 free chunk, i.e. bb_counters[12] = 1.

-----------
crash> grep bb_counter grp.txt
  bb_counters = 0xffff887e101a9648

crash> rd 0xffff887e101a9648 -32 16
ffff887e101a9648:  00000013 fffffe21 ffffff33 ffffffbb   ....!...3.......
ffff887e101a9658:  fffffffa fffffff8 00000007 00000002   ................
ffff887e101a9668:  0000000a 00000005 00000007 00000000   ................
ffff887e101a9678:  00000001 00000001 00000000 00000000   ................
-----------

But...

	k = mb_find_next_zero_bit(buddy, max, 0);
	BUG_ON(k >= max);	--> kernel BUG at fs/ext4/mballoc.c:1911!

Here k = max = 8: the buddy bitmap for order 12 shows every bit set to 1, i.e. no free block is available at that order.

(e) The log shows errors reading the block bitmap of this group (8383):

crash> grep group e4b.txt
  bd_group = 8383

However, by the time we reach BUG_ON(k >= max), reading this bitmap must have succeeded, and it is the inconsistency between ->bb_counters and the buddy bitmap that triggers the crash. But if the buddy bitmap had been regenerated, bb_counters should match the buddy bitmap.

Can anyone take a look at this, please?

Thanks,

-liubo
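
For anyone who wants to cross-check the numbers in (c) and (d), here is a small user-space sketch (not kernel code) that replays the scan over the dumped counters. It rests on assumptions not confirmed by the dump itself: that each bb_counters entry is a signed 32-bit value, that the filesystem uses 4 KiB blocks (so the loop's upper bound, s_blocksize_bits + 1, is 13), and that a group holds 32768 blocks. The helper names are made up for this illustration.

```c
#include <stdint.h>

/* The 16 words printed by "rd 0xffff887e101a9648 -32 16", in dump order. */
const uint32_t raw_counters[16] = {
	0x00000013, 0xfffffe21, 0xffffff33, 0xffffffbb,
	0xfffffffa, 0xfffffff8, 0x00000007, 0x00000002,
	0x0000000a, 0x00000005, 0x00000007, 0x00000000,
	0x00000001, 0x00000001, 0x00000000, 0x00000000,
};

/*
 * ASSUMPTION: each counter is a signed 32-bit value.
 * Decode two's complement by hand so the result does not depend on
 * implementation-defined unsigned-to-signed conversion.
 */
int32_t counter_at(unsigned int order)
{
	uint32_t v = raw_counters[order];

	if (v & 0x80000000u)
		return -(int32_t)(~v) - 1;
	return (int32_t)v;
}

/*
 * Mirror the "if (grp->bb_counters[i] == 0) continue;" test of the scan:
 * return the first order in [first, last] with a non-zero counter,
 * or -1 if there is none within the dumped range.
 */
int first_nonzero_order(unsigned int first, unsigned int last)
{
	unsigned int i;

	for (i = first; i <= last && i < 16; i++) {
		if (counter_at(i) != 0)
			return (int)i;
	}
	return -1;
}

/*
 * Number of bits one order occupies in the buddy bitmap,
 * ASSUMING one bit per chunk of (1 << order) blocks.
 */
unsigned int buddy_bits(unsigned int blocks_per_group, unsigned int order)
{
	return blocks_per_group >> order;
}
```

Under those assumptions, first_nonzero_order(11, 13) returns 12 and buddy_bits(32768, 12) returns 8, which agrees with max = 8 at the BUG_ON. Read as signed values, the entries for orders 1 to 5 come out negative (for example, counter_at(1) is -479), which may be one more hint that the counters no longer match the bitmap, but that is only my reading of the dump.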