On 11/28/24 14:16, Christoph Hellwig wrote:
> On Thu, Nov 28, 2024 at 02:07:58PM +0900, Damien Le Moal wrote:
>> A bad sector that gets remapped when overwritten is probably the most common,
>> and maybe the only one. I need to check again, but I think that for this case,
>> the scsi stack retries the remainder of a torn write so we probably do not even
>> see it in practice, unless the sector/zone is really dead and cannot be
>> recovered. But in that case, no matter what we do, that zone would not be
>> writable anymore.
>
> Yes, all retryable errors should be handled by the drivers. NVMe makes
> this very clear with the DNR bit, while SCSI deals with this on a more
> ad-hoc basis by looking at the sense codes. So by the time a write error
> bubbles up to the file systems I do not expect the device to ever
> recover from it. Maybe with some kind of dynamic depop in the future
> where we drop just that zone, but otherwise we're very much done.
>
>> Still trying to see if I can have some sort of synchronization between incoming
>> writes and zone wp update to avoid relying on the user doing a report zones.
>> That would ensure that emulated zone append always works like the real command.
>
> I think we're much better off leaving that to the submitter, because
> it better have a really good reason to resubmit a write to the zone.
> We'll just need to properly document the assumptions.

Sounds good. What do you think of adding the opportunistic "update zone wp"
whenever we execute a user report zones? It is very easy to do and should not
significantly slow down report zones itself, because we usually have very few
zone write plugs and the hash search is fast.

-- 
Damien Le Moal
Western Digital Research
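
For illustration only, the opportunistic "update zone wp" idea can be sketched as a toy user-space model. This is not kernel code: `ZoneReport`, `ZoneWritePlug` and `refresh_plugs_from_report` are hypothetical names, and a plain dict stands in for the zone write plug hash table. The sketch assumes a plug caches the write pointer as an offset from the zone start and that plugs exist only for the few zones with recent writes.

```python
# Toy user-space model; every name here is hypothetical, not a kernel API.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ZoneReport:
    """One zone descriptor as returned by a (simulated) report zones."""
    zone_no: int
    start: int  # first sector of the zone
    wp: int     # write pointer reported by the device (absolute sector)


@dataclass
class ZoneWritePlug:
    """Cached write state, kept only for zones with recent writes."""
    wp_offset: int  # cached write pointer, relative to the zone start


def refresh_plugs_from_report(plugs: Dict[int, ZoneWritePlug],
                              report: List[ZoneReport]) -> int:
    """Resync cached write pointers with a report; return how many changed."""
    fixed = 0
    for zone in report:
        plug = plugs.get(zone.zone_no)  # hash lookup, O(1) on average
        if plug is None:
            continue  # no plug for this zone: nothing cached to update
        device_offset = zone.wp - zone.start
        if plug.wp_offset != device_offset:
            plug.wp_offset = device_offset  # trust the device over the cache
            fixed += 1
    return fixed


# Example: a torn write left zone 2 at offset 48 while the cache says 64.
plugs = {2: ZoneWritePlug(wp_offset=64)}
report = [ZoneReport(0, 0, 0), ZoneReport(1, 1024, 1024),
          ZoneReport(2, 2048, 2096)]
fixed = refresh_plugs_from_report(plugs, report)
```

In this model the extra cost per reported zone is one dictionary lookup, and zones without a plug are skipped, which is the reason given above for expecting little impact on report zones.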