On 11/26/23 23:09, Christoph Hellwig wrote:
> I still think it is a very bad idea to add this amount of complexity
> to the SCSI code, for a model that can't work for the general case and
> diverges from the established NVMe model.

Hi Christoph,

Here is some additional background information:
* UFS vendors prefer the SCSI command set because they combine it with
  the M-PHY transport layer. This combination is more power efficient
  than NVMe over PCIe. According to the information I have available,
  power consumption in the M-PHY hibernation state is lower than in the
  PCIe L2 state. I have not yet heard about any attempts to combine the
  NVMe command set with the M-PHY transport layer. Even if this were
  possible, it would fragment the mobile storage market. This would
  increase the price of mobile storage devices, which is undesirable.
* I think that the "established NVMe model" in your email refers to the
  NVMe zone append command. As you know, there is no zone append in the
  SCSI ZBC standard.
* Using the software implementation of REQ_OP_ZONE_APPEND in
  drivers/scsi/sd_zbc.c is not an option. REQ_OP_ZONE_APPEND commands
  are serialized by that implementation. This serialization is
  unavoidable because a SCSI device may respond with a unit attention
  condition to any SCSI command. Hence, even if REQ_OP_ZONE_APPEND
  commands are submitted in order, they may be executed out of order.
  We do not want any serialization of SCSI commands because this has a
  significant negative impact on IOPS for UFS devices. The latest UFS
  devices support more than 300 K IOPS.
* Serialization of zoned writes in the I/O scheduler also reduces IOPS
  more than what is acceptable. Hence the approach of this patch series
  to support pipelining of zoned writes even if no I/O scheduler has
  been configured.

I think the amount of complexity introduced by this patch series in the
SCSI core is reasonable. No new states are introduced in the SCSI core.
A single call to a function that reorders pending SCSI commands is
introduced in the SCSI error handler (scsi_call_prepare_resubmit()).

Thanks,

Bart.
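
To illustrate the reordering argument above, here is a toy user-space
model in Python. It is only a sketch and not the code from the patch
series: the Write class, first_unaligned() and prepare_resubmit() are
hypothetical names, and sorting by starting LBA is an assumption about
what "reorders pending SCSI commands" amounts to. It shows how a single
requeued command breaks the sequential-write requirement of a zone and
how sorting the pending commands restores it.

```python
from dataclasses import dataclass


@dataclass
class Write:
    lba: int      # starting logical block address
    nblocks: int  # length in blocks


def first_unaligned(writes, write_pointer):
    """Return the index of the first write that does not start at the
    zone's write pointer, or None if all writes are sequential."""
    wp = write_pointer
    for i, w in enumerate(writes):
        if w.lba != wp:
            return i
        wp += w.nblocks
    return None


def prepare_resubmit(pending):
    """Hypothetical stand-in for the reordering step: sort the commands
    that still have to be (re)submitted by starting LBA."""
    return sorted(pending, key=lambda w: w.lba)


# Three pipelined writes to a zone whose write pointer is at LBA 0.
submitted = [Write(0, 8), Write(8, 8), Write(16, 8)]

# Assumption for this model: the device answers the first command with
# a unit attention, so a naive retry queues it behind the other two.
requeued = submitted[1:] + submitted[:1]

in_order_before = first_unaligned(requeued, 0) is None
in_order_after = first_unaligned(prepare_resubmit(requeued), 0) is None
```

In this model the naive retry order starts with a write at LBA 8 while
the write pointer is still at 0, which a zoned device would reject.
After sorting, every write again starts at the write pointer, without
having serialized the original submissions.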