Patch "btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info" has been added to the 5.6-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Wed, 06 May 2020 19:57:15 -0400

This is a note to let you know that I've just added the patch titled

    btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info

to the 5.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     btrfs-transaction-avoid-deadlock-due-to-bad-initiali.patch
and it can be found in the queue-5.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 81f253a43918c598959abb71f5b0cffc1a7892d1
Author: Qu Wenruo <wqu@xxxxxxxx>
Date:   Mon Apr 27 14:50:14 2020 +0800

    btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info
    
    commit fcc99734d1d4ced30167eb02e17f656735cb9928 upstream.
    
    [BUG]
    One run of btrfs/063 triggered the following lockdep warning:
      ============================================
      WARNING: possible recursive locking detected
      5.6.0-rc7-custom+ #48 Not tainted
      --------------------------------------------
      kworker/u24:0/7 is trying to acquire lock:
      ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
    
      but task is already holding lock:
      ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
    
      other info that might help us debug this:
       Possible unsafe locking scenario:
    
             CPU0
             ----
        lock(sb_internal#2);
        lock(sb_internal#2);
    
       *** DEADLOCK ***
    
       May be due to missing lock nesting notation
    
      4 locks held by kworker/u24:0/7:
       #0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80
       #1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80
       #2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
       #3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs]
    
      stack backtrace:
      CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
      Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
      Call Trace:
       dump_stack+0xc2/0x11a
       __lock_acquire.cold+0xce/0x214
       lock_acquire+0xe6/0x210
       __sb_start_write+0x14e/0x290
       start_transaction+0x66c/0x890 [btrfs]
       btrfs_join_transaction+0x1d/0x20 [btrfs]
       find_free_extent+0x1504/0x1a50 [btrfs]
       btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
       btrfs_alloc_tree_block+0x1ac/0x570 [btrfs]
       btrfs_copy_root+0x213/0x580 [btrfs]
       create_reloc_root+0x3bd/0x470 [btrfs]
       btrfs_init_reloc_root+0x2d2/0x310 [btrfs]
       record_root_in_trans+0x191/0x1d0 [btrfs]
       btrfs_record_root_in_trans+0x90/0xd0 [btrfs]
       start_transaction+0x16e/0x890 [btrfs]
       btrfs_join_transaction+0x1d/0x20 [btrfs]
       btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs]
       finish_ordered_fn+0x15/0x20 [btrfs]
       btrfs_work_helper+0x116/0x9a0 [btrfs]
       process_one_work+0x632/0xb80
       worker_thread+0x80/0x690
       kthread+0x1a3/0x1f0
       ret_from_fork+0x27/0x50
    
    It's pretty hard to reproduce; it has only been hit once so far.
    
    [CAUSE]
    This is because we're calling btrfs_join_transaction() without re-using
    the currently running transaction:
    
    btrfs_finish_ordered_io()
    |- btrfs_join_transaction()             <<< Call #1
       |- btrfs_record_root_in_trans()
          |- btrfs_reserve_extent()
             |- btrfs_join_transaction()    <<< Call #2
    
    Normally such a btrfs_join_transaction() call should re-use the existing
    transaction, without trying to start a transaction again.
    
    But the problem is, in btrfs_join_transaction() call #1, we call
    btrfs_record_root_in_trans() before initializing current::journal_info.
    
    And in btrfs_join_transaction() call #2, we're relying on
    current::journal_info to avoid such a deadlock.
    
    [FIX]
    Call btrfs_record_root_in_trans() after we have initialized
    current::journal_info.
    
    CC: stable@xxxxxxxxxxxxxxx # 4.4+
    Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
    Reviewed-by: David Sterba <dsterba@xxxxxxxx>
    Signed-off-by: David Sterba <dsterba@xxxxxxxx>
    Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index beb6c69cd1e55..a209e2ef547f4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -660,10 +660,19 @@ again:
 	}
 
 got_it:
-	btrfs_record_root_in_trans(h, root);
-
 	if (!current->journal_info)
 		current->journal_info = h;
+
+	/*
+	 * btrfs_record_root_in_trans() needs to alloc new extents, and may
+	 * call btrfs_join_transaction() while we're also starting a
+	 * transaction.
+	 *
+	 * Thus it need to be called after current->journal_info initialized,
+	 * or we can deadlock.
+	 */
+	btrfs_record_root_in_trans(h, root);
+
 	return h;
 
 join_fail: