On Thu, Feb 27, 2025 at 5:56 AM Qu Wenruo <wqu@xxxxxxxx> wrote: > > [BUG] > When testing subpage block size btrfs (block size < page size), I hit > the following spin lock hang on x86_64, with the experimental 2K block > size support: > > <TASK> > _raw_spin_lock_irq+0x2f/0x40 > wait_subpage_spinlock+0x69/0x80 [btrfs] > btrfs_release_folio+0x46/0x70 [btrfs] > folio_unmap_invalidate+0xcb/0x250 > folio_end_writeback+0x127/0x1b0 > btrfs_subpage_clear_writeback+0xef/0x140 [btrfs] > end_bbio_data_write+0x13a/0x3c0 [btrfs] > btrfs_bio_end_io+0x6f/0xc0 [btrfs] > process_one_work+0x156/0x310 > worker_thread+0x252/0x390 > ? __pfx_worker_thread+0x10/0x10 > kthread+0xef/0x250 > ? finish_task_switch.isra.0+0x8a/0x250 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x34/0x50 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > [CAUSE] > It's a self deadlock with the following sequence: > > btrfs_subpage_clear_writeback() > |- spin_lock_irqsave(&subpage->lock); > |- folio_end_writeback() > |- folio_end_dropbehind_write() > |- folio_unmap_invalidate() > |- btrfs_release_folio() > |- wait_subpage_spinlock() > |- spin_lock_irq(&subpage->lock); > !! DEADLOCK !! > > We're trying to acquire the same spin lock already held by ourselves. > > [FIX] > Move the folio_end_writeback() call out of the spin lock critical > section. > > And since we no longer have all the bitmap operation and the writeback > flag clearing happening inside the critical section, we must do extra > checks to make sure only the last one clearing the writeback bitmap can > clear the folio writeback flag. > > Fixes: 3470da3b7d87 ("btrfs: subpage: introduce helpers for writeback status") > Cc: stable@xxxxxxxxxxxxxxx # 5.15+ > Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx> Looks good, thanks. > --- > fs/btrfs/subpage.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c > index ebb40f506921..bedb5fac579b 100644 > --- a/fs/btrfs/subpage.c > +++ b/fs/btrfs/subpage.c > @@ -466,15 +466,21 @@ void btrfs_subpage_clear_writeback(const struct btrfs_fs_info *fs_info, > struct btrfs_subpage *subpage = folio_get_private(folio); > unsigned int start_bit = subpage_calc_start_bit(fs_info, folio, > writeback, start, len); > + bool was_writeback; > + bool last = false; > unsigned long flags; > > spin_lock_irqsave(&subpage->lock, flags); > + was_writeback = !subpage_test_bitmap_all_zero(fs_info, folio, writeback); > bitmap_clear(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits); > - if (subpage_test_bitmap_all_zero(fs_info, folio, writeback)) { > + if (subpage_test_bitmap_all_zero(fs_info, folio, writeback) && > + was_writeback) { > ASSERT(folio_test_writeback(folio)); > - folio_end_writeback(folio); > + last = true; > } > spin_unlock_irqrestore(&subpage->lock, flags); > + if (last) > + folio_end_writeback(folio); > } > > void btrfs_subpage_set_ordered(const struct btrfs_fs_info *fs_info, > -- > 2.48.1 > >