Re: [PATCH 3/4] scsi: improved eh timeout handler

Hannes Reinecke <hare@xxxxxxx> · Fri, 07 Jun 2013 08:42:52 +0200

On 06/07/2013 08:25 AM, Ren Mingxin wrote:
> Hi, Hannes:
> 
> On 06/07/2013 04:28 AM, Jörn Engel wrote:
>> On Thu, 6 June 2013 22:39:14 +0200, Hannes Reinecke wrote:
>>>>> +        spin_unlock_irqrestore(&sdev->list_lock, flags);
>>>>> +        SCSI_LOG_ERROR_RECOVERY(3,
>>>>> +            scmd_printk(KERN_INFO, scmd,
>>>>> +                    "aborting command %p\n", scmd));
>>>>> +        rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
>>>>> +        if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
>>>>> +            if (((scmd->request->cmd_flags&  REQ_FAILFAST_DEV) ||
>>>>
>>>> Am I being stupid again or should this be negated?
>>>>
>>> Knowing you I would think the former; where do you see the negation?
>>
>> If REQ_FAILFAST_DEV is set, this runs scsi_queue_insert(), which I
>> would expect it should run scsi_finish_command().
> 
> I also think (scmd->request->cmd_flags & REQ_FAILFAST_DEV) and
> (scmd->request->cmd_type == REQ_TYPE_BLOCK_PC) should be negated.
Bummer. You are correct.

So far I've only encountered the 'BLOCK_PC' condition, which worked.

> I'm confused why not use !scsi_noretry_cmd(scmd) directly as your
> former patch here?
> 
Hehe. I've fallen into the trap myself.
scsi_noretry_cmd() only works if you have a status for the command,
and will only evaluate REQ_FAILFAST_DEV or BLOCK_PC if the command
has a CHECK CONDITION status.
As this is the timeout handler we do _not_ have any status, so
scsi_noretry_cmd() will always tell us that the command should be
retried.

>>>>> +                 (scmd->request->cmd_type ==
>>>>> REQ_TYPE_BLOCK_PC))&&
>>>>> +                (++scmd->retries<= scmd->allowed)) {
>>>>> +                SCSI_LOG_ERROR_RECOVERY(3,
>>>>> +                    scmd_printk(KERN_WARNING, scmd,
>>>>> +                            "retry aborted command\n"));
>>>>> +
>>>>> +                scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
>>>>> +            } else {
>>>>> +                SCSI_LOG_ERROR_RECOVERY(3,
>>>>> +                    scmd_printk(KERN_WARNING, scmd,
>>>>> +                            "fast fail aborted command\n"));
>>>>> +                scmd->result |= DID_TRANSPORT_FAILFAST<<  16;
>>>>> +                scsi_finish_command(scmd);
>>>>> +            }
>>>>> +        } else {
>>>>> +            if (!scsi_eh_scmd_add(scmd, 0)) {
>>>>> +                SCSI_LOG_ERROR_RECOVERY(3,
>>>>> +                    scmd_printk(KERN_WARNING, scmd,
>>>>> +                            "terminate aborted command\n"));
>>>>> +                scmd->result |= DID_TIME_OUT<<  16;
>>>>> +                scsi_finish_command(scmd);
>>>>> +            }
>>>>> +        }
>>>>> +        spin_lock_irqsave(&sdev->list_lock, flags);
>>>>> +    }
>>>>> +    spin_unlock_irqrestore(&sdev->list_lock, flags);
>> ...
>>>>> +/**
>>>>> + * scsi_abort_command - schedule a command abort
>>>>> + * @scmd:    scmd to abort.
>>>>> + *
>>>>> + * We only need to abort commands after a command timeout
>>>>> + */
>>>>> +void
>>>>> +scsi_abort_command(struct scsi_cmnd *scmd)
>>>>> +{
>>>>> +    unsigned long flags;
>>>>> +    int kick_worker = 0;
>>>>> +    struct scsi_device *sdev = scmd->device;
>>>>> +
>>>>> +    spin_lock_irqsave(&sdev->list_lock, flags);
>>>>> +    if (list_empty(&sdev->eh_abort_list))
>>>>> +        kick_worker = 1;
>>>>> +    list_add(&scmd->eh_entry,&sdev->eh_abort_list);
>>>>> +    SCSI_LOG_ERROR_RECOVERY(3,
>>>>> +        scmd_printk(KERN_INFO, scmd, "adding to
>>>>> eh_abort_list\n"));
>>>>> +    spin_unlock_irqrestore(&sdev->list_lock, flags);
>>>>> +    if (kick_worker)
>>>>> +        schedule_work(&sdev->abort_work);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(scsi_abort_command);
> 
> Should the name of function above be more ideographic/understandable?
> For example, scsi_abort_scmd_add? I was bewildered among functions
> named scsi_abort_eh_cmnd, scsi_eh_abort_cmds...
> 
scsi_abort_scmd_add()? That's even more confusing.

scsi_abort_command() does exactly this, it aborts a command.
Yeah, the individual wrapper/callback function naming might be
improved, but this is the one function which actually does what it
advertises ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html