On Mon, Dec 09, 2024 at 09:23:56PM +0900, Damien Le Moal wrote:
> The zone reclaim processing of the dm-zoned device mapper uses
> blkdev_issue_zeroout() to align the write pointer of a zone being used
> for reclaiming another zone, to write the valid data blocks from the
> zone being reclaimed at the same position relative to the zone start in
> the reclaim target zone.
>
> The first call to blkdev_issue_zeroout() will try to use hardware
> offload using a REQ_OP_WRITE_ZEROES operation if the device reports a
> non-zero max_write_zeroes_sectors queue limit. If this operation fails
> because of the lack of hardware support, blkdev_issue_zeroout() falls
> back to using a regular write operation with the zero-page as buffer.
> Currently, such REQ_OP_WRITE_ZEROES failure is automatically handled by
> the block layer zone write plugging code which will execute a report
> zones operation to ensure that the write pointer of the target zone of
> the failed operation has not changed and to "rewind" the zone write
> pointer offset of the target zone as it was advanced when the write zero
> operation was submitted. So the REQ_OP_WRITE_ZEROES failure does not
> cause any issue and blkdev_issue_zeroout() works as expected.
>
> However, since the automatic recovery of zone write pointers by the zone
> write plugging code can potentially cause deadlocks with queue freeze
> operations, a different recovery must be implemented in preparation for
> the removal of zone write plugging report zones based recovery.
>
> Do this by introducing the new function blk_zone_issue_zeroout(). This
> function first calls blkdev_issue_zeroout() with the flag
> BLKDEV_ZERO_NOFALLBACK to intercept failures on the first execution,
> which attempts to use the device hardware offload with the
> REQ_OP_WRITE_ZEROES operation. If this attempt fails, a report zones
> operation is issued to restore the zone write pointer offset of the
> target zone to the correct position and blkdev_issue_zeroout() is called
> again without the BLKDEV_ZERO_NOFALLBACK flag. The report zones
> operation performing this recovery is implemented using the helper
> function disk_zone_sync_wp_offset() which calls the gendisk report_zones
> file operation with the callback disk_report_zones_cb(). This callback
> updates the target write pointer offset of the target zone using the new
> function disk_zone_wplug_sync_wp_offset().
>
> dmz_reclaim_align_wp() is modified to change its call to
> blkdev_issue_zeroout() to a call to blk_zone_issue_zeroout() without any
> other change needed as the two functions are functionally equivalent.
>
> Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Damien Le Moal <dlemoal@xxxxxxxxxx>

Acked-by: Mike Snitzer <snitzer@xxxxxxxxxx>