On 8/14/23 12:23, Bart Van Assche wrote: > On 8/13/23 19:41, Damien Le Moal wrote: >> But at the very least, may be mention in the commit message that you also add >> the unit tests associated with the change ? > > Hi Damien, > > I will mention this in the patch description when I repost this patch. > >> Another remark: we'll need this sorting only for devices that are zoned and do >> not need zone write locking. For other cases (most cases in fact), we don't and >> this could have some performance impact (e.g. fast RAID systems). Is there a way >> to have this eh_prepare_resubmit to do nothing for regular devices to avoid that ? > > The common case is that all commands passed to the SCSI error handler > are associated with the same ULD. For this case list_sort() iterates > once over the list with commands that have to be resubmitted because > all eh_prepare_resubmit pointers are identical. The code introduced > in the next patch requires O(n^2) time with n the number of commands > passed to the error handler. I expect this time to be smaller than the > time needed to wake up the error handler if n < 100. Waking up the > error handler involves expediting the grace period (call_rcu_hurry()) > and a context switch. I expect that these two mechanisms combined will > take more time than the list_sort() call if the number of commands that > failed is below 100. In other words, I don't think that > scsi_call_prepare_resubmit() will slow down error handling measurably > for the cases we care about. > > Do you perhaps want me to change the code in the next patch such that > it takes O(n) time instead of O(n^2) time in case no zoned devices are > involved? I was more thinking of a mean to not call scsi_call_prepare_resubmit() at all if no zoned devices with use_zone_write_lock == true are involved. -- Damien Le Moal Western Digital Research