ping ? to Jan Kara On 2020/3/12 10:38, lishan wrote: > Hi, Jan Kara: > > I have a long-standing problem, > In the ocfs2 file system, there is a panic problem related to b_committed_data, > details can be found in the historical mail. > > Refer to the old version of the ext3 file system, > ocfs2 file system uses the undo process to ensure that metadata > which is not committed to disk is not reused. > > I understand that for transactions that have not been committed, > b_committed_data will not be NULL, so, > in what scenario do you think b_committed_data will be set to NULL? > > Thanks, > Shan > > On 2020/3/11 9:15, Joseph Qi wrote: >> >> >> On 2020/3/9 16:26, lishan wrote: >>> A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. >>> The information of NULL pointer stack as follows: >>> >>> PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" >>> #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 >>> #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 >>> #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc >>> #3 [ffff0000b4d6b380] die at ffff00008008f6ac >>> #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 >>> #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c >>> #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 >>> #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c >>> PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] >>> LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] >>> SP: ffff0000b4d6b640 PSTATE: 60400009 >>> X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 >>> X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 >>> X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 >>> X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 >>> X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 >>> X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff >>> X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff >>> X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 >>> X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 >>> X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 >>> #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] >>> #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] >>> #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] >>> #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] >>> #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] >>> #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] >>> #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] >>> #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] >>> #16 [ffff0000b4d6bb70] evict at ffff000080365040 >>> #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 >>> #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] >>> #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 >>> #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 >>> #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 >>> #22 [ffff0000b4d6bcb0] dput at ffff00008036093c >>> #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 >>> #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac >>> #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc >>> >>> The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. >>> It is presumed that the network was disconnected during the write process, >>> causing the transaction abort. as follows: >>> jbd2_journal_abort >>> ....... >>> jbd2_journal_commit_transaction >>> jh->b_committed_data = NULL; >>> >>> _ocfs2_free_suballoc_bits >>> ocfs2_block_group_clear_bits >>> // undo_bg is now set to null >>> BUG_ON(!undo_bg); >>> >>> When applying for free space, if b_committed_data is null, >>> it will be directly occupied, as follows: >>> ocfs2_cluster_group_search >>> ocfs2_block_group_find_clear_bits >>> ocfs2_test_bg_bit_allocatable: >>> bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; >>> if (bg) >>> ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >>> else >>> ret = 1; >>> b_committed_data is an intermediate state backup for bitmap transaction commits, >>> newly applied space can overwrite previous dirty data, >>> so, I think, while free clusters, if b_committed_data is null, ignore it. >>> Host panic directly, too violent. >>> >>> Signed-off-by: Shan Li <lishan24@xxxxxxxxxx> >>> Reviewed-by: Jun Piao <piaojun@xxxxxxxxxx> >>> --- >>> fs/ocfs2/suballoc.c | 9 +++++++-- >>> 1 file changed, 7 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c >>> index 939df99d2dec..aaf1b3cbd984 100644 >>> --- a/fs/ocfs2/suballoc.c >>> +++ b/fs/ocfs2/suballoc.c >>> @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, >>> if (undo_fn) { >>> spin_lock(&jh->b_state_lock); >>> undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; >>> - BUG_ON(!undo_bg); >>> + if (!undo_bg) >>> + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " >>> + "b_committed_data had been cleared.\n", >>> + OCFS2_SB(alloc_inode->i_sb)->uuid_str, >>> + (unsigned long long)le64_to_cpu(bg->bg_blkno), >>> + alloc_inode->i_sb->s_id); >> >> Seems a kind of workaround. >> I am worrying about other abnormal cases of NULL b_committed_data, it >> may lead to a corrupt filesystem. >> So how about isolating the journal abort case? >> >> Thanks, >> Joseph >> >>> } >>> >>> tmp = num_bits; >>> while(tmp--) { >>> ocfs2_clear_bit((bit_off + tmp), >>> (unsigned long *) bg->bg_bitmap); >>> - if (undo_fn) >>> + if (undo_fn && undo_bg) >>> undo_fn(bit_off + tmp, >>> (unsigned long *) undo_bg->bg_bitmap); >>> } >>> >> >> . >>