Re: Debugging scsi abort handling ?

On 08/29/2014 12:39 PM, Hans de Goede wrote:
Hi,

On 08/29/2014 12:30 PM, Hannes Reinecke wrote:
On 08/29/2014 12:14 PM, Finn Thain wrote:

On Fri, 29 Aug 2014, Hannes Reinecke wrote:

On 08/29/2014 06:39 AM, Finn Thain wrote:

On Thu, 28 Aug 2014, Hannes Reinecke wrote:

What might happen, though, is that the command is already dead and gone
by the time you're calling ->scsi_done() (if you call it after
eh_abort). So there might not _be_ a command upon which you can call
->scsi_done() to start with.

Hence any LLDD needs to clear up any internal references after a call
to eh_XXX to ensure it doesn't call ->scsi_done() on an invalid
command.

So even if the LLDD returns 'FAILED' upon a call to eh_XXX it
_still_ needs to clear up the internal reference.
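
To illustrate the point above, here is a minimal userspace sketch, not
real driver code. All names (model_host, model_cmd, model_eh_abort,
model_hw_complete, the EH_* codes) are hypothetical stand-ins for
whatever the LLDD and midlayer actually use. It only models the idea
that the abort path drops the internal reference even when it reports
failure, so a late hardware completion finds nothing to complete.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TAGS 4

/* Stand-ins for the midlayer's SUCCESS/FAILED return codes. */
enum eh_result { EH_SUCCESS, EH_FAILED };

/* Hypothetical midlayer command; done_calls counts ->scsi_done() calls. */
struct model_cmd {
    int done_calls;
};

/* Hypothetical per-host table of in-flight commands, indexed by tag. */
struct model_host {
    struct model_cmd *inflight[MAX_TAGS];
    int tmf_works; /* 0 simulates a TMF that could not be sent */
};

/* Record a command as in flight; returns -1 if the tag is busy. */
static int model_queue(struct model_host *h, unsigned tag,
                       struct model_cmd *cmd)
{
    assert(tag < MAX_TAGS);
    if (h->inflight[tag])
        return -1;
    h->inflight[tag] = cmd;
    return 0;
}

/* Abort path: the reference is dropped whatever the TMF outcome is. */
static enum eh_result model_eh_abort(struct model_host *h,
                                     struct model_cmd *cmd)
{
    for (unsigned tag = 0; tag < MAX_TAGS; tag++) {
        if (h->inflight[tag] == cmd) {
            h->inflight[tag] = NULL;
            return h->tmf_works ? EH_SUCCESS : EH_FAILED;
        }
    }
    /* No matching internal command: nothing left to abort. */
    return EH_SUCCESS;
}

/* Late completion: returns 1 only if a command was actually completed. */
static int model_hw_complete(struct model_host *h, unsigned tag)
{
    assert(tag < MAX_TAGS);
    struct model_cmd *cmd = h->inflight[tag];

    if (!cmd)
        return 0; /* reference already gone, do not touch the command */
    h->inflight[tag] = NULL;
    cmd->done_calls++; /* models calling ->scsi_done() */
    return 1;
}
```

In this model, queueing a command, aborting it with tmf_works == 0 and
then delivering the completion leaves done_calls at 0, although the
abort itself reported EH_FAILED.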

This is a question that has been bothering me too. If the host's
eh_abort_cmd() method returns FAILED, it seems the mid-layer is liable
to re-issue the same command to the LLD (?)

No.
FAILED for any eh_abort_cmd() means that the TMF hasn't been sent.

Makes sense, though it appears to contradict this advice about returning
SUCCESS in some situations:
http://marc.info/?l=linux-scsi&m=140923498632496&w=2

Well, if the LLDD detects an invalid command (i.e. if it cannot find any
internal command matching the midlayer command), that's an automatic
success, obviously.

So we should rephrase things to:

- The eh_XXX callback shall return 'SUCCESS' if the respective
   TMF (or equivalent) could be initiated or if the matching command
   reference has already been completed by the LLDD. Otherwise
   the eh_XXX callback shall return 'FAILED'.
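
The rule above can be written down as a small decision function. This
is only a sketch of the wording in this mail; eh_result_for(), its two
parameters and the EH_* codes are hypothetical names, not midlayer API.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the midlayer's SUCCESS/FAILED return codes. */
enum eh_result { EH_SUCCESS, EH_FAILED };

/*
 * cmd_still_known: the LLDD still holds an internal reference that
 *                  matches the midlayer command.
 * tmf_initiated:   the TMF (or equivalent) could be kicked off.
 */
static enum eh_result eh_result_for(bool cmd_still_known, bool tmf_initiated)
{
    if (!cmd_still_known)
        return EH_SUCCESS; /* already completed by the LLDD */
    return tmf_initiated ? EH_SUCCESS : EH_FAILED;
}

/* Walks all four input combinations of the rule. */
static void eh_result_selfcheck(void)
{
    assert(eh_result_for(false, false) == EH_SUCCESS);
    assert(eh_result_for(false, true) == EH_SUCCESS);
    assert(eh_result_for(true, true) == EH_SUCCESS);
    assert(eh_result_for(true, false) == EH_FAILED);
}
```

Note that only one of the four combinations yields EH_FAILED: the
command is still known to the LLDD and the TMF could not be initiated.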

You're talking about "could be initiated", so that means that at this
point the abort does not yet have to be completed, do I get that
right? What should the LLDD then do when the abort finishes,
call eh_scsi_done on the cmnd?

Correct. It's up to the LLDD whether it waits for the TMF to complete
before returning or if it just kicks off the TMF and returns
immediately.
In the latter case the LLDD obviously has to be prepared to handle
concurrent TMFs.

scsi_eh_done() is the internal 'scsi_done' callback for commands
issued during SCSI EH. This is _not_ the completion routine for TMFs.
Again, it's up to the LLDD to implement TMF completion if it chooses
to implement synchronous TMFs.

No LLDD should _ever_ touch or call scsi_eh_done().

What about the abort never finishing (timeout), does the mid layer
track this, or should the LLDD do that?

TMFs have no timeout associated with them (sadly), so the LLDD needs
to track it internally.
(And please do. We've run into quite some issues with LLDDs _not_
implementing a TMF timeout.)
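
As a rough illustration of such internal tracking, here is a userspace
sketch that keeps a deadline per outstanding TMF. Everything in it
(model_tmf, the tmf_* helpers, the tick counter passed in as 'now') is
a hypothetical model, not an existing kernel interface; a real LLDD
would use whatever timer facility suits it.

```c
#include <assert.h>
#include <stdbool.h>

enum tmf_state { TMF_IDLE, TMF_PENDING, TMF_DONE, TMF_TIMED_OUT };

/* Hypothetical per-TMF bookkeeping kept by the LLDD. */
struct model_tmf {
    enum tmf_state state;
    unsigned long deadline; /* tick at which the LLDD gives up */
};

/* Kick off a TMF and remember when it has to be finished. */
static void tmf_start(struct model_tmf *t, unsigned long now,
                      unsigned long timeout)
{
    assert(t->state != TMF_PENDING);
    t->state = TMF_PENDING;
    t->deadline = now + timeout;
}

/* Response from the device; ignored once the TMF was given up on. */
static void tmf_complete(struct model_tmf *t)
{
    if (t->state == TMF_PENDING)
        t->state = TMF_DONE;
}

/*
 * Called periodically. Returns true if the TMF expired on this call.
 * The signed difference keeps the comparison valid across tick
 * counter wraparound.
 */
static bool tmf_check_timeout(struct model_tmf *t, unsigned long now)
{
    if (t->state == TMF_PENDING && (long)(now - t->deadline) >= 0) {
        t->state = TMF_TIMED_OUT;
        return true;
    }
    return false;
}
```

With a TMF started at tick 100 and a timeout of 30 ticks, the check
reports nothing at tick 129 and flags the TMF as timed out at tick 130;
a response arriving after that no longer changes the state.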

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)