On 11/28/24 12:20, Christoph Hellwig wrote:
> On Thu, Nov 28, 2024 at 08:18:16AM +0900, Damien Le Moal wrote:
>> The BIO that failed is not recovered. The user will see the failure. The error
>> recovery report zones is all about avoiding more failures of plugged zone append
>> BIOs behind that failed BIO. These can succeed with the error recovery.
>>
>> So sure, we can fail all BIOs. The user will see more failures. If that is OK,
>> that's easy to do. But in the end, that is not a solution because we still need
>
> What is the scenario where only one I/O will fail? The time of dust on
> a sector failign writes to just sector are long gone these days.
>
> So unless we can come up with a scenario where:
>
> - one I/O will fail, but others won't
> - this matters to the writer
>
> optimizing for being able to just fail a single I/O seems like a
> wasted effort.
>
>> to get an updated zone write pointer to be able to restart zone append
>> emulation. Otherwise, we are in the dark and will not know where to send the
>> regular writes emulating zone append. That means that we still need to issue a
>> zone report and that is racing with queue freeze and reception of a new BIO. We
>> cannot have new BIOs "wait" for the zone report as that would create a hang
>> situation again if a queue freeze is started between reception of the new BIO
>> and the zone report. Do we fail these new BIOs too ? That seems extreme.
>
> Just add a "need resync" flag and do the report zones before issuing the
> next write?

The problem here would be that "before issuing the next write" needs to be
really before we do a blk_queue_enter() for that write, so that would need to be
on entry to blk_mq_submit_bio() or earlier in the stack. Otherwise, we end up
with the write bio completion depending on the report zones, again, and the
potential hang is back.

But I have something now that completely removes report zones.
Same idea as you suggested: a BLK_ZONE_WPLUG_NEED_WP_UPDATE flag that is set on
error and an automatic update of the zone write plug to the start sector of a
bio when we start seeing writes again for that zone. The idea is that
well-behaved users will do a report zones after a failed write and restart
writing at the correct position.

And for good measure, I modified report zones to also automatically update the
wp of zones that have BLK_ZONE_WPLUG_NEED_WP_UPDATE set. So the user doing a
report zones clears everything up.

Overall, that removes *a lot* of code and makes things a lot simpler. Starting
test runs with that now.

-- 
Damien Le Moal
Western Digital Research
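
For illustration only, here is a minimal userspace sketch of the flag scheme
described above. It is not the actual patch and not kernel code: the struct,
the function names and the ZWPLUG_NEED_WP_UPDATE constant are hypothetical
stand-ins that only model BLK_ZONE_WPLUG_NEED_WP_UPDATE, and the alignment
check is an assumption about how a write plug would validate a write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of BLK_ZONE_WPLUG_NEED_WP_UPDATE (not the kernel value). */
#define ZWPLUG_NEED_WP_UPDATE (1u << 0)

/* Hypothetical, simplified zone write plug state. */
struct zone_wplug {
	unsigned int flags;
	uint64_t zone_start;	/* first sector of the zone */
	uint64_t wp_offset;	/* cached write pointer, in sectors from zone_start */
};

/* A write to the zone failed: the cached write pointer is no longer trusted. */
static void zwplug_write_error(struct zone_wplug *zwplug)
{
	zwplug->flags |= ZWPLUG_NEED_WP_UPDATE;
}

/*
 * A new write arrives for the zone. If a resync is pending, adopt the start
 * sector of this write as the write pointer and clear the flag. Returns false
 * if the write does not start at the (possibly updated) write pointer.
 */
static bool zwplug_prepare_write(struct zone_wplug *zwplug, uint64_t sector,
				 uint64_t nr_sectors)
{
	uint64_t offset;

	assert(sector >= zwplug->zone_start);
	offset = sector - zwplug->zone_start;

	if (zwplug->flags & ZWPLUG_NEED_WP_UPDATE) {
		zwplug->wp_offset = offset;
		zwplug->flags &= ~ZWPLUG_NEED_WP_UPDATE;
	}

	if (offset != zwplug->wp_offset)
		return false;

	zwplug->wp_offset += nr_sectors;
	return true;
}

/*
 * A report zones returned the device write pointer for this zone. If a resync
 * is pending, take the reported position and clear the flag.
 */
static void zwplug_report_update(struct zone_wplug *zwplug, uint64_t dev_wp)
{
	assert(dev_wp >= zwplug->zone_start);

	if (zwplug->flags & ZWPLUG_NEED_WP_UPDATE) {
		zwplug->wp_offset = dev_wp - zwplug->zone_start;
		zwplug->flags &= ~ZWPLUG_NEED_WP_UPDATE;
	}
}
```

In this model, nothing on the write path waits for a report zones: the flag is
cleared either by the next write seen for the zone or by a report zones issued
by the user, whichever comes first.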