On Wed, Mar 17, 2021 at 10:51:51AM +0100, David Sterba wrote:
> From: Qu Wenruo <wqu@xxxxxxxx>
>
> [ Upstream commit b12de52896c0e8213f70e3a168fde9e6eee95909 ]
>
> [BUG]
> When running btrfs/072 with only one online CPU, it has a pretty high
> chance to fail:
>
>   btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg)
>   - output mismatch (see xfstests-dev/results//btrfs/072.out.bad)
>       --- tests/btrfs/072.out 2019-10-22 15:18:14.008965340 +0800
>       +++ /xfstests-dev/results//btrfs/072.out.bad 2019-11-14 15:56:45.877152240 +0800
>       @@ -1,2 +1,3 @@
>        QA output created by 072
>        Silence is golden
>       +Scrub find errors in "-m dup -d single" test
>       ...
>
> And with the following call trace:
>
>   BTRFS info (device dm-5): scrub: started on devid 1
>   ------------[ cut here ]------------
>   BTRFS: Transaction aborted (error -27)
>   WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
>   CPU: 0 PID: 55087 Comm: btrfs Tainted: G W O 5.4.0-rc1-custom+ #13
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
>   Call Trace:
>    __btrfs_end_transaction+0xdb/0x310 [btrfs]
>    btrfs_end_transaction+0x10/0x20 [btrfs]
>    btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs]
>    scrub_enumerate_chunks+0x264/0x940 [btrfs]
>    btrfs_scrub_dev+0x45c/0x8f0 [btrfs]
>    btrfs_ioctl+0x31a1/0x3fb0 [btrfs]
>    do_vfs_ioctl+0x636/0xaa0
>    ksys_ioctl+0x67/0x90
>    __x64_sys_ioctl+0x43/0x50
>    do_syscall_64+0x79/0xe0
>    entry_SYSCALL_64_after_hwframe+0x49/0xbe
>   ---[ end trace 166c865cec7688e7 ]---
>
> [CAUSE]
> The error number -27 is -EFBIG, returned from the following call chain:
> btrfs_end_transaction()
> |- __btrfs_end_transaction()
>    |- btrfs_create_pending_block_groups()
>       |- btrfs_finish_chunk_alloc()
>          |- btrfs_add_system_chunk()
>
> This happens because we have used up all space of
> btrfs_super_block::sys_chunk_array.
>
> The root cause is, we have the following bad loop of creating tons of
> system chunks:
>
> 1. The only SYSTEM chunk is being scrubbed
>    It's very common to have only one SYSTEM chunk.
> 2. New SYSTEM bg will be allocated
>    As btrfs_inc_block_group_ro() will check if we have enough space
>    after marking current bg RO. If not, then allocate a new chunk.
> 3. New SYSTEM bg is still empty, will be reclaimed
>    During the reclaim, we will mark it RO again.
> 4. That newly allocated empty SYSTEM bg gets scrubbed
>    We go back to step 2, as the bg is already marked RO but still not
>    cleaned up yet.
>
> If the cleaner kthread doesn't get executed fast enough (e.g. only one
> CPU), then we will get more and more empty SYSTEM chunks, using up all
> the space of btrfs_super_block::sys_chunk_array.
>
> [FIX]
> Since scrub/dev-replace doesn't always need to allocate new extents,
> especially chunk tree extents, we don't really need to do chunk
> pre-allocation.
>
> To break the above spiral, here we introduce a new parameter to
> btrfs_inc_block_group_ro(), @do_chunk_alloc, which indicates whether we
> need extra chunk pre-allocation.
>
> For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass
> @do_chunk_alloc=false.
> This should keep unnecessary empty chunks from popping up for scrub.
>
> Also, since there are two parameters for btrfs_inc_block_group_ro(),
> add more comments for it.
>
> Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> Signed-off-by: David Sterba <dsterba@xxxxxxxx>
> ---
>
> There's a report for 5.4 and the patch applies with a minor fixup
> without dependencies.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=210447

Thanks, now queued up.

greg k-h
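
For readers following along, the loop described under [CAUSE] and the effect of @do_chunk_alloc can be sketched as a small userspace toy model. This is not the kernel code and not the patch itself: every toy_* name is hypothetical, the cleaner kthread is modelled as never running, and the sizes (a 2048-byte array, a 17-byte key plus a 112-byte two-stripe chunk item per entry) are assumptions about the on-disk format rather than values taken from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed sizes, for illustration only (not taken from the patch). */
#define TOY_SYS_CHUNK_ARRAY_SIZE 2048   /* modelled sys_chunk_array capacity */
#define TOY_ENTRY_SIZE (17 + 112)       /* assumed key + DUP chunk item */

/* Hypothetical toy state; none of this is a kernel structure. */
struct toy_fs {
	int array_used;     /* bytes consumed in the modelled array */
	int nr_system_bgs;  /* SYSTEM block groups not yet cleaned up */
};

/* Append one SYSTEM chunk entry, failing with -EFBIG when the array is full. */
static int toy_add_system_chunk(struct toy_fs *fs)
{
	if (fs->array_used + TOY_ENTRY_SIZE > TOY_SYS_CHUNK_ARRAY_SIZE)
		return -EFBIG;
	fs->array_used += TOY_ENTRY_SIZE;
	fs->nr_system_bgs++;
	return 0;
}

/* A fresh toy filesystem with the usual single SYSTEM chunk (step 1). */
static struct toy_fs toy_fs_init(void)
{
	struct toy_fs fs = { 0, 0 };
	int ret = toy_add_system_chunk(&fs);

	assert(ret == 0);
	(void)ret;
	return fs;
}

/*
 * Model of marking a block group read-only. It covers only the case from
 * step 2 where free space is insufficient, so a chunk is allocated whenever
 * do_chunk_alloc is true.
 */
static int toy_inc_block_group_ro(struct toy_fs *fs, bool do_chunk_alloc)
{
	if (!do_chunk_alloc)
		return 0;
	return toy_add_system_chunk(fs);
}

/*
 * Scrub every SYSTEM block group. Empty block groups are never reclaimed
 * here, which models a cleaner kthread that does not get to run (step 4).
 */
static int toy_scrub(struct toy_fs *fs, bool do_chunk_alloc)
{
	for (int scrubbed = 0; scrubbed < fs->nr_system_bgs; scrubbed++) {
		int ret = toy_inc_block_group_ro(fs, do_chunk_alloc);

		if (ret)
			return ret;
	}
	return 0;
}
```

In this model, toy_scrub(&fs, true) keeps finding a newly allocated block group to scrub until the array is full (15 entries under the assumed sizes) and then returns -EFBIG, while toy_scrub(&fs, false) returns 0 and leaves the single SYSTEM block group as it was.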