Re: [PATCH v6 10/28] btrfs: do sequential extent allocation in HMZONED mode

Josef Bacik <josef@xxxxxxxxxxxxxx> · Tue, 17 Dec 2019 14:19:13 -0500

On 12/12/19 11:08 PM, Naohiro Aota wrote:
On HMZONED drives, writes must always be sequential and directed at a block
group zone write pointer position. Thus, block allocation in a block group
must also be done sequentially using an allocation pointer equal to the
block group zone write pointer plus the number of blocks allocated but not
yet written.

The sequential allocation function find_free_extent_zoned() bypasses the
checks in find_free_extent() and increases the reserved byte counter by
itself. A region cannot be reverted once it has been allocated sequentially,
since the revert might race with other allocations and leave an allocation
hole, which breaks the sequential write rule.
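
To make the allocation rule above concrete, here is a minimal user-space
sketch. It is not the code from this patch: struct zoned_bg, zoned_alloc() and
all field names except wp_broken are hypothetical stand-ins, and locking is
left out entirely. It only illustrates that the allocation pointer moves
forward and that there is deliberately no "undo" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a zoned block group. */
struct zoned_bg {
	uint64_t start;        /* logical start address of the block group */
	uint64_t length;       /* size of the block group in bytes */
	uint64_t alloc_offset; /* write pointer offset plus bytes allocated
				* but not yet written */
	uint64_t reserved;     /* bytes handed out by the allocator */
	bool wp_broken;        /* write pointer unreliable: read-only */
};

/*
 * Allocate num_bytes at the current allocation pointer. On success the
 * logical address is stored in *out and true is returned. There is no
 * matching "unallocate" helper on purpose: giving a region back would
 * leave a hole below the allocation pointer.
 */
static bool zoned_alloc(struct zoned_bg *bg, uint64_t num_bytes, uint64_t *out)
{
	assert(bg->alloc_offset <= bg->length);

	if (bg->wp_broken || num_bytes == 0)
		return false;
	if (num_bytes > bg->length - bg->alloc_offset)
		return false; /* not enough room left in this block group */

	*out = bg->start + bg->alloc_offset;
	bg->alloc_offset += num_bytes; /* the pointer only moves forward */
	bg->reserved += num_bytes;     /* the allocator accounts the reservation itself */
	return true;
}
```

Two back-to-back calls therefore return adjacent addresses, and a failed call
leaves the pointer untouched.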

Furthermore, this commit introduces two new variables to struct
btrfs_block_group. "wp_broken" indicates that the write pointer is broken
(e.g. not synced on a RAID1 block group) and marks that block group
read-only. "zone_unusable" keeps track of the size of regions in a block
group that were once allocated and then freed. Such regions are never usable
until the underlying zones are reset.
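
As a rough illustration of the "zone_unusable" idea, the sketch below models
the accounting in user space. It is an assumption-laden toy, not the patch:
struct bg_acct and the helper names are hypothetical, and only the field name
zone_unusable is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-block-group accounting for a zoned device. */
struct bg_acct {
	uint64_t length;        /* size of the block group in bytes */
	uint64_t alloc_offset;  /* sequential allocation pointer */
	uint64_t used;          /* bytes currently holding live data */
	uint64_t zone_unusable; /* bytes allocated once, then freed */
};

/* Only the space above the allocation pointer can still be allocated. */
static uint64_t bg_usable_bytes(const struct bg_acct *bg)
{
	return bg->length - bg->alloc_offset;
}

/* Freeing data does not make space available again; it becomes unusable. */
static void bg_free_region(struct bg_acct *bg, uint64_t num_bytes)
{
	assert(num_bytes <= bg->used);
	bg->used -= num_bytes;
	bg->zone_unusable += num_bytes;
}

/*
 * Model of a zone reset: allowed only when no live data remains.
 * It rewinds the allocation pointer and clears the unusable counter.
 */
static bool bg_reset_zones(struct bg_acct *bg)
{
	if (bg->used != 0)
		return false;
	bg->alloc_offset = 0;
	bg->zone_unusable = 0;
	return true;
}
```

In this model the freed bytes stay counted in zone_unusable until
bg_reset_zones() succeeds, which matches the statement that such regions are
not usable until the zones are reset.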

This commit also introduces "bytes_zone_unusable" to track such unusable
bytes in a space_info. Pinned bytes are always reclaimed to
"bytes_zone_unusable". They are not usable until the zones are reset.

Please separate this out into its own patch. These things are a bear to
review as it is, and it doesn't help that I need to keep track of two
different things per patch.  Thanks,

Josef