Re: [PATCH v8 3/9] scsi: core: Call .eh_prepare_resubmit() before resubmitting

Damien Le Moal <dlemoal@xxxxxxxxxx> · Mon, 14 Aug 2023 13:18:25 +0900

On 8/14/23 12:23, Bart Van Assche wrote:
> On 8/13/23 19:41, Damien Le Moal wrote:
>> But at the very least, may be mention in the commit message that you also add
>> the unit tests associated with the change ?
> 
> Hi Damien,
> 
> I will mention this in the patch description when I repost this patch.
> 
>> Another remark: we'll need this sorting only for devices that are zoned and do
>> not need zone write locking. For other cases (most cases in fact), we don't and
>> this could have some performance impact (e.g. fast RAID systems). Is there a way
>> to have this eh_prepare_resubmit to do nothing for regular devices to avoid that ?
> 
> The common case is that all commands passed to the SCSI error handler
> are associated with the same ULD. For this case list_sort() iterates
> once over the list with commands that have to be resubmitted because
> all eh_prepare_resubmit pointers are identical. The code introduced
> in the next patch requires O(n^2) time with n the number of commands
> passed to the error handler. I expect this time to be smaller than the
> time needed to wake up the error handler if n < 100. Waking up the
> error handler involves expediting the grace period (call_rcu_hurry())
> and a context switch. I expect that these two mechanisms combined will
> take more time than the list_sort() call if the number of commands that
> failed is below 100. In other words, I don't think that
> scsi_call_prepare_resubmit() will slow down error handling measurably
> for the cases we care about.
> 
> Do you perhaps want me to change the code in the next patch such that
> it takes O(n) time instead of O(n^2) time in case no zoned devices are
> involved?

I was more thinking of a mean to not call scsi_call_prepare_resubmit() at all if
no zoned devices with use_zone_write_lock == true are involved.

-- 
Damien Le Moal
Western Digital Research