On 1/10/25 04:02, Bart Van Assche wrote:
> On 11/19/24 12:01 AM, Damien Le Moal wrote:
>> On 11/19/24 09:27, Bart Van Assche wrote:
>>> This patch series improves small write IOPS by a factor of four (+300%) for
>>> zoned UFS devices on my test setup with an UFSHCI 3.0 controller. Although
>>> you are probably busy because the merge window is open, please take a look
>>> at this patch series when you have the time. This patch series is organized
>>> as follows:
>>> - Bug fixes for existing code at the start of the series.
>>> - The write pipelining support implementation comes after the bug fixes.
>>
>> Impressive improvements but the changes are rather invasive. Have you tried
>> simpler solution like forcing unplugging a zone write plug from the driver once
>> a command is passed to the driver and the driver did not reject it ? It seems
>> like this would make everything simpler on the block layer side. But I am not
>> sure if the performance gains would be the same.
>
> (replying to an email from two months ago)
>
> Hi Damien,
>
> Here is a strong reason why the simpler solution mentioned above is not
> sufficient: if anything goes wrong with the communication between UFS
> host controller and UFS device (e.g. a command timeout or a power mode
> transition fails) then the SCSI error handler is activated. This results
> in ufshcd_err_handler() being called. That function resets both the host
> controller and the UFS device (ufshcd_reset_and_restore()). At that time
> multiple commands may be outstanding.
>
> In other words, submitting UFS commands in order is not sufficient. An
> approach is needed that is compatible with the SCSI error handler and
> also that ensures that commands are resubmitted in LBA order per zone
> after the SCSI error handler has completed.

If the failed commands are retried, they will be requeued and you will not see
the error as the request will not be completed yet, no ?
And if you do see the error back in the block layer, you cannot just retry the
command at will. The issuer must do that, no ?

I am confused... Please send patches to discuss based on code. That will be
easier.

>
> The statistics I have access to show that the UFS error handler is
> activated infrequently or never on any single device but also that it is
> activated a nontrivial number of times across the entire device
> population.
>
> Thanks,
>
> Bart.
>

-- 
Damien Le Moal
Western Digital Research