On 8/25/23 01:52, Bart Van Assche wrote:
> On 8/24/23 09:44, Hannes Reinecke wrote:
>> On 8/24/23 16:47, Bart Van Assche wrote:
>>> Thanks for the feedback. I agree that it would be great to have zone
>>> append support in F2FS. However, I do not agree that switching from
>>> regular writes to zone append in F2FS would remove the need for sorting
>>> SCSI commands by LBA in the SCSI error handler. Even if F2FS would
>>> submit zoned writes then the following mechanisms could still cause
>>> reordering of the zoned write after these have been translated into
>>> regular writes:
>>> * The SCSI protocol allows SCSI devices, including UFS devices, to
>>>   respond with a unit attention or the SCSI BUSY status at any time. If
>>>   multiple write commands are pending and some of the pending SCSI
>>>   writes are not executed because of a unit attention or because of
>>>   another reason, this causes command reordering.
>>
>> Yes. But the important thing to remember is that with 'zone append' the
>> resulting LBA will be returned on completion, they will _not_ be
>> specified in the submission. So any command reordering doesn't affect
>> the zone append commands as they haven't been written yet.
>>
>>> * Although the link between the UFS controller and the UFS device is
>>>   pretty reliable, there is a non-zero chance that a SCSI command is
>>>   lost. If this happens the SCSI timeout and error handlers are
>>>   activated. This can cause reordering of write commands.
>>>
>> Again, reordering is not an issue with zone append. With zone append you
>> specify in which zone the command should land, and upon completion the
>> LBA where the data is written will be returned.
>>
>> So if there is an error the command has not been written, consequently
>> there is no LBA to worry about, and you can reorder at will.
>
> Hi Hannes,
>
> I agree that reordering is not an issue for NVMe zone append commands.
> It is an issue however with SCSI devices because there is no zone append
> command in the SCSI command set. The sd_zbc.c code translates zone
> appends (REQ_OP_ZONE_APPEND) into regular WRITE commands. If these WRITE
> commands are reordered, the ZBC standard requires that these commands
> fail with an UNALIGNED WRITE error. So I think for SCSI devices what you
> wrote is wrong.

I think that Hannes' point was that if you ensure that the rejected regular
write commands used to emulate zone append go through the sd driver again
when they are requeued and resubmitted, they will be set up again to emulate
the original zone append using the latest wp location, which is assumed to be
correct. And that does not depend on the ordering. So requeuing these regular
writes does not need sorting. It can be done in any order. The constraint is
of course that they must be re-prepared from the original REQ_OP_ZONE_APPEND
every time they are requeued.

-- 
Damien Le Moal
Western Digital Research