On 7/11/23 03:01, Bart Van Assche wrote:
> From ZBC-2: "The device server terminates with CHECK CONDITION status, with
> the sense key set to ILLEGAL REQUEST, and the additional sense code set to
> UNALIGNED WRITE COMMAND a write command, other than an entire medium write
> same command, that specifies: a) the starting LBA in a sequential write
> required zone set to a value that is not equal to the write pointer for that
> sequential write required zone; or b) an ending LBA that is not equal to the
> last logical block within a physical block (see SBC-5)."
>
> I am not aware of any other conditions that may trigger the UNALIGNED
> WRITE COMMAND response.

Trying to write less than 4KB on a 4Kn drive. But that should never
happen for kernel-issued commands.

>
> Send commands that failed with an unaligned write error to the SCSI error
> handler. Let the SCSI error handler sort SCSI commands per LBA before
> resubmitting these.
>
> Increase the number of retries for write commands sent to a sequential
> zone to the maximum number of outstanding commands.

I think I mentioned this before. When we started the btrfs work, we did
something similar (but at the IO scheduler level) to try to avoid adding
a big lock in btrfs to serialize (and thus order) writes. What we
discovered is that it was extremely easy to fall into a situation where
the maximum possible number of outstanding requests is already issued,
but they are all behind a "hole" and indefinitely delayed because the
missing request cannot be issued due to the max nr request limit being
reached. No forward progress and deadlock.

I do not see how your change addresses this problem. The same will
happen with this, and I do not have any suggestion for how to solve it.
For btrfs, we ended up using zone append emulation for scsi to avoid the
big lock and to spare the FS from having to order writes. That solution
guarantees forward progress. Delaying already issued writes that are not
sequential has no such guarantee.
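
To make the "hole" scenario concrete, here is a toy model in Python. It is
only a sketch and not kernel code: the function simulate() and the
parameter max_outstanding are hypothetical names, writes are reduced to
their position in the zone, and an issued write that is not at the write
pointer is assumed to keep its slot until it can complete.

```python
from collections import deque


def simulate(submit_order, max_outstanding):
    """Toy model of delayed out-of-order writes to one sequential zone.

    submit_order: permutation of range(n), the order in which the
        submitter hands writes to the queue (an assumption of this model).
    max_outstanding: hypothetical limit on issued-but-incomplete writes.
    Returns (write_pointer, stalled).
    """
    pending = deque(submit_order)  # writes not yet issued
    held = set()                   # issued writes, each occupying one slot
    wp = 0                         # zone write pointer
    total = len(pending)

    while wp < total:
        # Issue writes in submission order until every slot is taken.
        while pending and len(held) < max_outstanding:
            held.add(pending.popleft())

        if wp in held:
            # The write at the write pointer completes and frees its slot.
            held.remove(wp)
            wp += 1
        else:
            # Every slot holds a write that is beyond the write pointer,
            # and the missing write cannot be issued: no forward progress.
            return wp, True

    return wp, False


if __name__ == "__main__":
    print(simulate([1, 2, 3, 0], max_outstanding=3))
    print(simulate([1, 2, 3, 0], max_outstanding=4))
```

With three slots, writes 1, 2 and 3 take all of them while write 0 is
still waiting to be issued, so the model reports a stall at write pointer
0. With four slots the same submission order completes.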

-- 
Damien Le Moal
Western Digital Research