Re: [PATCH] btrfs: workaround the over-confident over-commit available space calculation

Josef Bacik <josef@xxxxxxxxxxxxxx> · Mon, 5 Oct 2020 09:05:57 -0400

On 9/30/20 8:01 AM, Qu Wenruo wrote:
[BUG]
There are quite a few bug reports of btrfs falling into an ENOSPC trap,
where btrfs can't even start a transaction to add new devices.

[CAUSE]
Most of the reports involve multi-device profiles, like
RAID1/RAID10/RAID5/RAID6, and the involved disks have very unbalanced
sizes.

It turns out that the overcommit calculation in btrfs_can_overcommit()
is just a factor-based calculation, which can't check whether the devices
can really fulfill the requirement for the desired profile.

This makes btrfs_can_overcommit() always over-confident about usable
space. When we can't allocate any new metadata chunk but still allow
new metadata operations, we fall into the ENOSPC trap and have no way
to exit it.
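
To make the gap concrete, here is a minimal sketch in plain userspace C. It is
not kernel code. It assumes the factor-based estimate can be modeled as "sum of
unallocated bytes across devices, divided by the profile factor", and compares
that with what a two-copy RAID1 layout could really allocate. The function
names and the per-device array are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Factor-based estimate (simplified model): add up the unallocated space
 * of every device and divide by the profile factor (2 for RAID1).
 */
uint64_t factor_based_avail(const uint64_t *unalloc, size_t n, unsigned factor)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += unalloc[i];
	return total / factor;
}

/*
 * Layout-aware bound for two-copy RAID1: both copies of a chunk must land
 * on different devices, so the usable space is limited by
 * min(total / 2, total - largest).
 */
uint64_t raid1_real_avail(const uint64_t *unalloc, size_t n)
{
	uint64_t total = 0, largest = 0;

	if (n < 2)
		return 0;
	for (size_t i = 0; i < n; i++) {
		total += unalloc[i];
		if (unalloc[i] > largest)
			largest = unalloc[i];
	}
	uint64_t rest = total - largest;
	uint64_t half = total / 2;
	return rest < half ? rest : half;
}
```

With two devices that have 1000 and 10 units unallocated, the factor-based
estimate is 505 units while RAID1 can really place only 10.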

[WORKAROUND]
The root fix needs a device-layout-aware available space calculation,
like the one the chunk allocator does.

Such a patchset was once submitted to the mailing list, but the extra
failure mode is tricky to handle for chunk allocation, so that
patchset needs more time to mature.

Meanwhile, to prevent such problems from reaching more users, work
around the problem by:
- Halve the over-commit available space reported
   So that we won't always be that over-confident.
   But this won't really help if the disk sizes are extremely unbalanced.

- Don't over-commit if the space info is already full
   This may already be too late, but it is still better than doing nothing
   and believing the over-commit values.
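
The two workaround steps can be sketched roughly as follows. This is a
userspace model, not the patch itself: struct space_info_model and
can_overcommit_model() are hypothetical names, and the final comparison is
only an assumption about the general shape of the overcommit check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant space info state. */
struct space_info_model {
	uint64_t total_bytes;	/* bytes already backed by allocated chunks */
	uint64_t used;		/* bytes used or reserved so far */
	bool full;		/* no further chunk can be allocated */
};

/*
 * avail is the (possibly over-confident) estimate of space that future
 * chunks could provide; bytes is the size of the new reservation.
 */
bool can_overcommit_model(const struct space_info_model *si,
			  uint64_t avail, uint64_t bytes)
{
	/* Step 2: never over-commit once the space info is full. */
	if (si->full)
		return false;

	/* Step 1: trust only half of the estimated available space. */
	avail /= 2;

	return si->used + bytes < si->total_bytes + avail;
}
```

With total_bytes = 100, used = 100 and avail = 40, a 15-unit reservation is
still allowed (115 < 120) but a 25-unit one is refused, and nothing is allowed
once full is set.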

I just had a thought: what if we simply cap free_chunk_space to the min of
the free space of all the devices?  Simply walk through all the devices on
mount, and do the initial set to whatever the smallest one is.  The rest of
the math would work out fine, and the rest of the modifications would work
fine.  The only "tricky" part would be when we do a shrink or grow, we'd have
to re-calculate the sizes for everybody, but that's not a big deal.  Thanks,

Josef
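
A minimal sketch of the capping idea, under one literal reading of it: the
value used for free_chunk_space is simply the smallest per-device free space.
This is plain userspace C, and capped_free_chunk_space() and the per-device
array are hypothetical, not existing btrfs helpers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk all devices and return the smallest unallocated size. This would
 * run once at mount time, and again after any device shrink or grow.
 */
uint64_t capped_free_chunk_space(const uint64_t *unalloc, size_t n)
{
	if (n == 0)
		return 0;

	uint64_t min = unalloc[0];

	for (size_t i = 1; i < n; i++) {
		if (unalloc[i] < min)
			min = unalloc[i];
	}
	return min;
}
```

For devices with 1000 and 10 units free this gives 10, so after the usual
division by the RAID1 factor the estimate is 5, which errs on the
conservative side. Growing the small device to 500 and recalculating gives
500.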