Re: libata error handling

Mike Anderson <andmike@xxxxxxxxxx> · Fri, 19 Aug 2005 13:29:54 -0700

Luben Tuikov <luben_tuikov@xxxxxxxxxxx> wrote:
> On 08/19/05 15:38, Patrick Mansfield wrote:
> The eh_timed_out + eh_strategy_handler is actually pretty perfect,
> and _complete_, for any application and purpose in recovering a
> LU/device/host (in that order ;-) ).
> 
> > The two problems I see with the hook are:
> > 
> > It calls the driver in interrupt context, so the called function can't
> > sleep.
> 
> Consider this: When SCSI Core told you that the command timed out,
> 	A) it has already finished,
> 	B) it hasn't already finished.
> 
> In case A, you can return EH_HANDLED.  In case B, you return
> EH_NOT_HANDLED, and deal with it in the eh_strategy_handler.
> (Hint: you can still "finish" it from there.)
> 

But dealing with it in the eh_strategy_handler means that you may be
stopping all I/O on the host instance as soon as the first LUN returns
EH_NOT_HANDLED, even when all that is needed is a LUN-based cancel.
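
To make the trade-off concrete, here is a minimal sketch in plain C. It
is not kernel code: every mock_* name is hypothetical, the enum only
stands in for the EH_HANDLED / EH_NOT_HANDLED values discussed above,
and "the host enters recovery as soon as one command is escalated" is a
deliberate simplification of what the midlayer does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the two return values named in this thread. */
enum mock_eh_ret { MOCK_EH_NOT_HANDLED, MOCK_EH_HANDLED };

/* Hypothetical host: one flag models "error handling is running". */
struct mock_host {
	bool in_recovery;
};

/* Hypothetical command: which LUN it targets and whether it finished. */
struct mock_cmd {
	struct mock_host *host;
	unsigned int lun;
	bool completed;
};

/*
 * Timed-out hook. The real hook runs in interrupt context, so this
 * sketch only inspects state and never waits for anything.
 */
enum mock_eh_ret mock_timed_out(struct mock_cmd *cmd)
{
	assert(cmd != NULL && cmd->host != NULL);

	if (cmd->completed)
		return MOCK_EH_HANDLED;		/* case A: already finished */

	/* Case B: escalate. Modeled as the host entering recovery. */
	cmd->host->in_recovery = true;
	return MOCK_EH_NOT_HANDLED;
}

/* While the host is in recovery, new I/O is held back for every LUN. */
bool mock_can_queue(const struct mock_host *host, unsigned int lun)
{
	(void)lun;	/* the LUN is irrelevant: the whole host is held */
	return !host->in_recovery;
}
```

In this model a single stuck command on LUN 1 makes mock_can_queue()
return false for LUN 0 and LUN 2 as well, which is the host-wide stall
I am concerned about.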

I still think we can do better here for an LLDD that cannot execute a
cancel in interrupt context.

Having an error handler that works is a plus, but I would hope that over
time some code gets factored out of the eh_strategy_handler into a
transport-level (or other common) error handler. From a testing, support,
and block-level multipath predictability standpoint, sharing code would be
a good goal.

-andmike
--
Michael Anderson
andmike@xxxxxxxxxx