Re: [PATCH v6 15/28] btrfs: serialize data allocation and submit IOs

Josef Bacik <josef@xxxxxxxxxxxxxx> · Tue, 17 Dec 2019 14:49:44 -0500

On 12/12/19 11:09 PM, Naohiro Aota wrote:
To preserve the sequential write pattern on the drives, we must serialize
allocation and submit_bio. This commit adds a per-block group mutex
"zone_io_lock", and find_free_extent_zoned() holds the lock. The lock is kept
even after returning from find_free_extent(). It is released when submitting
the IOs corresponding to the allocation is completed.

Implementing such behavior under __extent_writepage_io() is almost
impossible because once pages are unlocked we cannot tell whether submitting
IOs for an allocated region has finished. Instead, this commit adds
run_delalloc_hmzoned() to write out non-compressed data IOs at once using
extent_write_locked_range(). After the write, we can call
btrfs_hmzoned_data_io_unlock() to unlock the block group for new
allocation.

Signed-off-by: Naohiro Aota <naohiro.aota@xxxxxxx>

Have you actually tested these patches with lock debugging on?  The
submit_compressed_extents stuff is async, so the task doing the unlock will not
be the lock owner, and that'll make all sorts of things blow up.  This is just
straight up broken.

I would really rather see a hmzoned block scheduler that just doesn't submit the
bios until they are aligned with the WP, that way this intelligence doesn't
have to be dealt with at the file system layer.  I get allocating in line with
the WP, but this whole forcing us to allocate and submit the bio in lock step is
just nuts, and broken in your subsequent patches.  This whole approach needs to
be reworked.  Thanks,

Josef
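
For reference, the kind of scheduler behavior suggested above could look
roughly like the toy model below.  This is only a sketch under heavy
assumptions: it is plain userspace C, it does not use the kernel elevator
interfaces, and struct zone_queue, zone_queue_submit() and
zone_queue_dispatch() are hypothetical names.  Writes may arrive in any order,
and a write is only "dispatched" once its start offset equals the zone's write
pointer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16

/* One queued write to a zone (hypothetical structure). */
struct zone_write {
	uint64_t start;
	uint64_t len;
	bool used;
};

/* Toy per-zone queue: holds writes until they line up with the WP. */
struct zone_queue {
	uint64_t wp;                       /* current write pointer */
	struct zone_write pending[MAX_PENDING];
	uint64_t dispatched[MAX_PENDING];  /* log of dispatch order */
	size_t ndispatched;
};

static void zone_queue_init(struct zone_queue *q, uint64_t wp)
{
	*q = (struct zone_queue){ .wp = wp };
}

/* Dispatch every held write that has become aligned with the WP. */
static void zone_queue_dispatch(struct zone_queue *q)
{
	bool progressed = true;

	while (progressed) {
		progressed = false;
		for (size_t i = 0; i < MAX_PENDING; i++) {
			struct zone_write *w = &q->pending[i];

			if (!w->used || w->start != q->wp)
				continue;
			/* Record the order for inspection, if the log has room. */
			if (q->ndispatched < MAX_PENDING)
				q->dispatched[q->ndispatched++] = w->start;
			q->wp += w->len;
			w->used = false;
			progressed = true;
		}
	}
}

/* Queue a write; it is held back until it is aligned with the WP. */
static bool zone_queue_submit(struct zone_queue *q, uint64_t start,
			      uint64_t len)
{
	if (len == 0 || start < q->wp)
		return false;	/* behind the WP: cannot be written */

	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (q->pending[i].used)
			continue;
		q->pending[i] = (struct zone_write){ start, len, true };
		zone_queue_dispatch(q);
		return true;
	}
	return false;		/* no free slot */
}
```

With a queue initialized at WP 0, submitting a write at offset 8 first leaves
it held; once the write at offset 0 shows up, both go out in WP order without
the submitter having to hold any lock across allocation and submission.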