Re: [PATCH v2 2/6] qla2xxx: Simplify the code for aborting SCSI commands

Bart Van Assche <bvanassche@xxxxxxx> · Thu, 23 Jan 2020 19:12:04 -0800

On 2020-01-23 02:17, Daniel Wagner wrote:
> On Wed, Jan 22, 2020 at 08:23:41PM -0800, Bart Van Assche wrote:
>> Since the SCSI core does not reuse the tag of the SCSI command that is
>> being aborted by .eh_abort() before .eh_abort() has finished it is not
>> necessary to check from inside that callback whether or not the SCSI command
>> has already completed. Instead, rely on the firmware to return an error code
>> when attempting to abort a command that has already completed. Additionally,
>> rely on the firmware to return an error code when attempting to abort an
>> already aborted command.
>>
>> In qla2x00_abort_srb(), use blk_mq_request_started() instead of
>> sp->completed and sp->aborted.
>>
>> This patch eliminates several race conditions triggered by the removed member
>> variables.
> 
> I can only guess here what the races are but I agree removing the
> logic here and relying on the SCSI layer to handle it correctly makes
> sense. 

I will make the patch description more clear when I repost this patch.

>> +/*
>> + * The caller must ensure that no completion interrupts will happen
>> + * while this function is in progress.
>> + */
> 
> So could we add something like WARN_ON(irqs_disabled())?

qla2x00_abort_all_cmds() is typically called after the kernel driver
discovered that the firmware stopped processing commands. If the
firmware has stopped processing commands it is safe to call this
function without disabling interrupts. Even if the caller would disable
interrupts, that would only disable interrupts on the local CPU but not
on other CPUs. Additionally, disabling interrupts is not a proper
solution because if a completion interrupt arrives after a command has
been aborted that will cause trouble if the command handle has already
been associated with another command.

Bart.