On 8/13/23 19:41, Damien Le Moal wrote:
But at the very least, may be mention in the commit message that you also add the unit tests associated with the change ?
Hi Damien, I will mention this in the patch description when I repost this patch.
Another remark: we'll need this sorting only for devices that are zoned and do not need zone write locking. For other cases (most cases in fact), we don't and this could have some performance impact (e.g. fast RAID systems). Is there a way to have this eh_prepare_resubmit to do nothing for regular devices to avoid that ?
The common case is that all commands passed to the SCSI error handler are associated with the same ULD. For this case list_sort() iterates once over the list with commands that have to be resubmitted because all eh_prepare_resubmit pointers are identical. The code introduced in the next patch requires O(n^2) time with n the number of commands passed to the error handler. I expect this time to be smaller than the time needed to wake up the error handler if n < 100. Waking up the error handler involves expediting the grace period (call_rcu_hurry()) and a context switch. I expect that these two mechanisms combined will take more time than the list_sort() call if the number of commands that failed is below 100. In other words, I don't think that scsi_call_prepare_resubmit() will slow down error handling measurably for the cases we care about. Do you perhaps want me to change the code in the next patch such that it takes O(n) time instead of O(n^2) time in case no zoned devices are involved? Thanks, Bart.