On Mon 11-12-23 19:25:44, Ye Bin wrote: > There's issue when do io test: > WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170] > CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE > Call trace: > dump_backtrace+0x0/0x1a0 > show_stack+0x24/0x30 > dump_stack+0xb0/0x100 > watchdog_timer_fn+0x254/0x3f8 > __hrtimer_run_queues+0x11c/0x380 > hrtimer_interrupt+0xfc/0x2f8 > arch_timer_handler_phys+0x38/0x58 > handle_percpu_devid_irq+0x90/0x248 > generic_handle_irq+0x3c/0x58 > __handle_domain_irq+0x68/0xc0 > gic_handle_irq+0x90/0x320 > el1_irq+0xcc/0x180 > queued_spin_lock_slowpath+0x1d8/0x320 > jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2] > kjournald2+0xec/0x2f0 [jbd2] > kthread+0x134/0x138 > ret_from_fork+0x10/0x18 > > Analyzed informations from vmcore as follows: > (1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list'; > (2) Now is processing the 855th jbd2_inode; > (3) JBD2 task has TIF_NEED_RESCHED flag; > (4) There's no pags in address_space around the 855th jbd2_inode; > (5) There are some process is doing drop caches; > (6) Mounted with 'nodioread_nolock' option; > (7) 128 CPUs; > > According to informations from vmcore we know 'journal->j_list_lock' spin lock > competition is fierce. So journal_finish_inode_data_buffers() maybe process > slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors(). > However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK, > will not call cond_resched(). So may lead to soft lockup. > journal_finish_inode_data_buffers > filemap_fdatawait_range_keep_errors > __filemap_fdatawait_range > while (index <= end) > nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK); > if (!nr_pages) > break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched() > for (i = 0; i < nr_pages; i++) > wait_on_page_writeback(page); > cond_resched(); > > To solve above issue, add scheduling point in the journal_finish_inode_data_buffers(); > > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx> Makes sense. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > fs/jbd2/commit.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index 9bdb377a348f..5e122586e06e 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -270,6 +270,7 @@ static int journal_finish_inode_data_buffers(journal_t *journal, > if (!ret) > ret = err; > } > + cond_resched(); > spin_lock(&journal->j_list_lock); > jinode->i_flags &= ~JI_COMMIT_RUNNING; > smp_mb(); > -- > 2.31.1 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR