On 12/9/24 16:57, Christoph Hellwig wrote:
> On Mon, Dec 09, 2024 at 07:57:57AM +0900, Damien Le Moal wrote:
>> Avoid this problem by completely getting rid of the need for executing a
>> report zone from within the zone write plugging code, instead relying on
>> the user either executing a report zones, resetting the zone or
>> finishing the zone of a failed write. This is not an unresannable
>
> s/unresannable/unreasonable/ ?

Yes.

>
>> requirement as all well-behaved applications, FSes and device mapper
>> already use report zones to recover from write errors whenever possible.
>
> I think the real question here is what errors the file system (or other
> submitter) can even recover from. The next patch deals with the not
> support case for a "special" operation, and that's of course a valid one.

Yep. But even that one is actually coded in SCSI to return -EIO instead of
ENOTSUPP. We can patch that (return ENOTSUPP for an invalid opcode error),
but I am not sure that is safe to do given that it has been this way for
ages. This is all to say that we cannot even reliably distinguish the
special/valid error cases that can be recovered from, as opposed to actual
medium/hard errors.

> The first patch already excludes EAGAIN from nowait, and the drivers
> already retry anything that they think is retryable by just resubmitting
> without bubbling it up to the submitter. That mostly leaves fatal
> media errors as all modern hardware that supports zones just remaps
> on write media failures. I.e. for those the most sane answer is to
> simply shut down the file system for single-device file systems, or
> treat the device as faulty for multi-device file systems. This might
> change when we support logical depop on a per-zone basis, but I don't
> think anyone is there yet. We also really should test this case.
> I'll add a testcase with error injection for zoned xfs, and someone
> should do the same for btrfs (including multi-device handling) and
> f2fs.

I have test cases for zonefs already. That is because zonefs has the
"recover-error" mount option which forces a recovery of a file size (== write
pointer position) if a write fails or is torn. The default even for zonefs is
to go read-only since there is indeed not much we can do about failed writes.

> Sorry for the long rant - not a comment on the code itself but maybe
> the commit log could use a little update.

OK. Will try to improve it.

> Also we probably need to recover this information somewhere in the
> docs.

Hmmm... not sure we have a good place for this. Let me figure out something.

-- 
Damien Le Moal
Western Digital Research