Re: [blktests] zbd/012: Test requeuing of zoned writes and queue freezing

Damien Le Moal <dlemoal@xxxxxxxxxx> · Thu, 28 Nov 2024 13:37:46 +0900

On 11/28/24 12:20, Christoph Hellwig wrote:
> On Thu, Nov 28, 2024 at 08:18:16AM +0900, Damien Le Moal wrote:
>> The BIO that failed is not recovered. The user will see the failure. The error
>> recovery report zones is all about avoiding more failures of plugged zone append
>> BIOs behind that failed BIO. These can succeed with the error recovery.
>>
>> So sure, we can fail all BIOs. The user will see more failures. If that is OK,
>> that's easy to do. But in the end, that is not a solution because we still need
> 
> What is the scenario where only one I/O will fail?  The time of dust on
> a sector failing writes to just that sector is long gone these days.
> 
> So unless we can come up with a scenario where:
> 
>  - one I/O will fail, but others won't
>  - this matters to the writer
> 
> optimizing for being able to just fail a single I/O seems like a
> wasted effort.
> 
>> to get an updated zone write pointer to be able to restart zone append
>> emulation. Otherwise, we are in the dark and will not know where to send the
>> regular writes emulating zone append. That means that we still need to issue a
>> zone report and that is racing with queue freeze and reception of a new BIO. We
>> cannot have new BIOs "wait" for the zone report as that would create a hang
>> situation again if a queue freeze is started between reception of the new BIO
>> and the zone report. Do we fail these new BIOs too ? That seems extreme.
> 
> Just add a "need resync" flag and do the report zones before issuing the
> next write?

The problem here would be that "before issuing the next write" needs to be
really before we do a blk_queue_enter() for that write, so that would need to be
on entry to blk_mq_submit_bio() or earlier in the stack. Otherwise, we end up
with the completion of the write bio depending on the report zones, again, and
the potential hang is back.

But I have something now that completely removes report zones. Same idea as you
suggested: a BLK_ZONE_WPLUG_NEED_WP_UPDATE flag that is set on error and an
automatic update of the zone write plug to the start sector of a bio when we
start seeing writes again for that zone. The idea is that well-behaved users
will do a report zone after a failed write and restart writing at the correct
position.

And for good measure, I modified report zones to also automatically update the
wp of zones that have BLK_ZONE_WPLUG_NEED_WP_UPDATE. So the user doing a report
zones clears everything up.
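
To make the two paragraphs above concrete, here is a rough userspace sketch of
the idea. It is not the actual patch: only the BLK_ZONE_WPLUG_NEED_WP_UPDATE
name comes from the description above. The bit value, struct zone_wplug as laid
out here, and the wplug_*() helpers are hypothetical and only model the flag
handling, with no locking and no BIO plugging.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag name from the mail above; the bit value is an assumption. */
#define BLK_ZONE_WPLUG_NEED_WP_UPDATE	(1U << 0)

/* Hypothetical, heavily simplified stand-in for a zone write plug. */
struct zone_wplug {
	unsigned int flags;
	uint64_t zone_start;	/* first sector of the zone */
	uint64_t wp_offset;	/* tracked write pointer, in sectors from zone_start */
};

/* A write failed: the tracked write pointer can no longer be trusted. */
static void wplug_write_error(struct zone_wplug *zwp)
{
	zwp->flags |= BLK_ZONE_WPLUG_NEED_WP_UPDATE;
}

/*
 * A new write arrives for the zone. If the write pointer needs an update,
 * trust the start sector of this write (a well-behaved user has done a
 * report zones and restarts at the right position). Otherwise, reject
 * writes that are not aligned to the tracked write pointer.
 */
static bool wplug_prepare_write(struct zone_wplug *zwp, uint64_t sector,
				uint32_t nr_sectors)
{
	uint64_t offset = sector - zwp->zone_start;

	if (zwp->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE) {
		zwp->wp_offset = offset;
		zwp->flags &= ~BLK_ZONE_WPLUG_NEED_WP_UPDATE;
	} else if (offset != zwp->wp_offset) {
		return false;
	}

	zwp->wp_offset += nr_sectors;
	return true;
}

/*
 * A report zones returned dev_wp (absolute sector) for this zone: use it
 * to resynchronize a zone that is flagged as needing an update.
 */
static void wplug_report_zone(struct zone_wplug *zwp, uint64_t dev_wp)
{
	if (zwp->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE) {
		zwp->wp_offset = dev_wp - zwp->zone_start;
		zwp->flags &= ~BLK_ZONE_WPLUG_NEED_WP_UPDATE;
	}
}
```

The point of the sketch is that neither path issues a report zones from the
write path itself, so nothing in the submission path has to wait on one while a
queue freeze may be starting.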

Overall, that removes *a lot* of code and makes things a lot simpler. Starting
test runs with that now.

-- 
Damien Le Moal
Western Digital Research