Patch "jbd2: fix soft lockup in journal_finish_inode_data_buffers()" has been added to the 5.10-stable tree

This is a note to let you know that I've just added the patch titled

    jbd2: fix soft lockup in journal_finish_inode_data_buffers()

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     jbd2-fix-soft-lockup-in-journal_finish_inode_data_bu.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 1b9ccd568848ee62f246107b2f5b269e3d8173a6
Author: Ye Bin <yebin10@xxxxxxxxxx>
Date:   Mon Dec 11 19:25:44 2023 +0800

    jbd2: fix soft lockup in journal_finish_inode_data_buffers()
    
    [ Upstream commit 6c02757c936063f0631b4e43fe156f8c8f1f351f ]
    
    There's an issue when running an IO test:
    WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
    CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
    Call trace:
     dump_backtrace+0x0/0x1a0
     show_stack+0x24/0x30
     dump_stack+0xb0/0x100
     watchdog_timer_fn+0x254/0x3f8
     __hrtimer_run_queues+0x11c/0x380
     hrtimer_interrupt+0xfc/0x2f8
     arch_timer_handler_phys+0x38/0x58
     handle_percpu_devid_irq+0x90/0x248
     generic_handle_irq+0x3c/0x58
     __handle_domain_irq+0x68/0xc0
     gic_handle_irq+0x90/0x320
     el1_irq+0xcc/0x180
     queued_spin_lock_slowpath+0x1d8/0x320
     jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
     kjournald2+0xec/0x2f0 [jbd2]
     kthread+0x134/0x138
     ret_from_fork+0x10/0x18
    
    Analysis of the vmcore gives the following information:
    (1) There are about 5k+ jbd2_inodes on 'commit_transaction->t_inode_list';
    (2) The task is currently processing the 855th jbd2_inode;
    (3) The jbd2 task has the TIF_NEED_RESCHED flag set;
    (4) There are no pages in the address_space of the jbd2_inodes around the 855th one;
    (5) Some processes are dropping caches at the same time;
    (6) The filesystem is mounted with the 'nodioread_nolock' option;
    (7) The machine has 128 CPUs.
    
    From the vmcore we know that contention on the 'journal->j_list_lock' spinlock
    is fierce, so journal_finish_inode_data_buffers() may make progress slowly.
    In theory there is a scheduling point inside filemap_fdatawait_range_keep_errors();
    however, if the inode's address_space has no pages tagged with
    PAGECACHE_TAG_WRITEBACK, cond_resched() is never reached, which can lead to the
    soft lockup:
    journal_finish_inode_data_buffers
      filemap_fdatawait_range_keep_errors
        __filemap_fdatawait_range
          while (index <= end)
            nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
            if (!nr_pages)
               break;    --> if 'nr_pages' is zero, break out here without ever calling cond_resched()
            for (i = 0; i < nr_pages; i++)
              wait_on_page_writeback(page);
            cond_resched();
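    
    For reference, the loop in __filemap_fdatawait_range() (mm/filemap.c) looks
    roughly like the sketch below. It is based on the call path above and the
    mainline code of that era; exact details vary between kernel versions, so
    treat it as an illustration of the early-break path rather than verbatim code:
    
        static void __filemap_fdatawait_range(struct address_space *mapping,
                                              loff_t start_byte, loff_t end_byte)
        {
                pgoff_t index = start_byte >> PAGE_SHIFT;
                pgoff_t end = end_byte >> PAGE_SHIFT;
                struct pagevec pvec;
                unsigned nr_pages;
        
                pagevec_init(&pvec);
                while (index <= end) {
                        unsigned i;
        
                        /* Collect pages in the range still tagged for writeback. */
                        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
                                        end, PAGECACHE_TAG_WRITEBACK);
                        if (!nr_pages)
                                break;  /* nothing under writeback: cond_resched() is skipped */
        
                        for (i = 0; i < nr_pages; i++)
                                wait_on_page_writeback(pvec.pages[i]);
                        pagevec_release(&pvec);
        
                        /* Only reached when at least one tagged page was found. */
                        cond_resched();
                }
        }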
    
    To fix the above issue, add a scheduling point in journal_finish_inode_data_buffers().
    
    Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
    Reviewed-by: Jan Kara <jack@xxxxxxx>
    Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@xxxxxxxxxx
    Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index fa24b407a9dc..db137671a41f 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -300,6 +300,7 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
 			if (!ret)
 				ret = err;
 		}
+		cond_resched();
 		spin_lock(&journal->j_list_lock);
 		jinode->i_flags &= ~JI_COMMIT_RUNNING;
 		smp_mb();
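
With this change applied, the per-inode loop in journal_finish_inode_data_buffers()
has roughly the following shape. This is a sketch reconstructed from the hunk above
and the surrounding jbd2 code of that era, not a verbatim copy of fs/jbd2/commit.c,
so minor details may differ:

	spin_lock(&journal->j_list_lock);
	list_for_each_entry(jinode, &commit_transaction->t_inode_list, i_list) {
		if (!(jinode->i_flags & JI_WAIT_DATA))
			continue;
		jinode->i_flags |= JI_COMMIT_RUNNING;
		spin_unlock(&journal->j_list_lock);

		/* Wait for data IO; for ext4 this ends up in
		 * filemap_fdatawait_range_keep_errors(), which can return without
		 * ever scheduling when no pages are tagged for writeback. */
		if (journal->j_finish_inode_data_buffers) {
			err = journal->j_finish_inode_data_buffers(jinode);
			if (!ret)
				ret = err;
		}

		/* Scheduling point added by this patch: yield once per inode so
		 * that a list with thousands of entries cannot hold the CPU long
		 * enough to trigger the soft-lockup watchdog. */
		cond_resched();

		spin_lock(&journal->j_list_lock);
		jinode->i_flags &= ~JI_COMMIT_RUNNING;
		smp_mb();
		wake_up_bit(&jinode->i_flags, __JI_COMMIT_RUNNING);
	}
	/* ... list cleanup and final spin_unlock(&journal->j_list_lock) ... */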



