On 3/23/23 17:26, Christoph Hellwig wrote:
> On Tue, Mar 21, 2023 at 07:36:12AM -0700, Bart Van Assche wrote:
>> The UFSHCI specification is very clear about the requirement that UFS host
>> controllers must process SCSI commands in order if host software sets one
>> bit at a time in the UFSHCI 3.0 doorbell register: "For Task Management
>> Requests and Transfer Requests, software may issue multiple commands at a
>> time, and may issue new commands before previous commands have completed.
>> When software sets the corresponding doorbell register, the Task Management
>> Requests and Transfer Requests automatically get a time stamp with their
>> issue time. The commands within a command list (Task Management List or
>> Transfer Request List) shall be processed in
>> the order of their time stamps, starting from the oldest time stamp. In the
>> case multiple commands from the same list have the same time stamp, they
>> shall be processed in the order of their command list index,
>> starting from the lowest index."
>
> But we can't write Linux software just for UFS. We have no sensible
> ordering guarantee anywhere else.
>
>> Damien and Jens agree about introducing an additional hardware queue for
>> preserving the order of zoned writes as one can see here:
>> https://lore.kernel.org/linux-block/ed255a4a-a0da-a962-2da4-13321d0a75c5@xxxxxxxxx/
>>
>> In our tests pipelining zoned writes (REQ_OP_WRITE) works fine as long as
>> the UFS error handler is not activated. After the UFS error handler has
>> been scheduled and before the SCSI host state is changed into
>> SHOST_RECOVERY, the UFS host controller driver responds with
>> SCSI_MLQUEUE_HOST_BUSY. I'm still working on a solution for the reordering
>> caused by this mechanism.
>
> We'll still need REQ_OP_ZONE_APPEND as the actual file system fast path
> interface. For a low-end device like UFS the sd.c emulation might be
> able to take advantage of the above separate queue as an implementation
> detail.

For the zone append emulation, the write locking is done by sd.c, and the
upper layer does not restrict itself to one append per zone. So we could
envision a UFS version of the sd write locking calls that is optimized for
the device capabilities, while keeping a common upper layer (which is
preferable in my opinion).

-- 
Damien Le Moal
Western Digital Research