Re: SCSI error handling -- one error blocks the whole SCSI host

Jack Wang <jinpu.wang@xxxxxxxxxxxxxxxx> · Thu, 23 May 2013 21:07:34 +0200

> James, am I understanding your suggestion properly?  If so can you
> explain what you meant about the libsas code -- I see that it has its
> own strategy handler but as I said before we've already stopped every
> device attached to the HBA before we ever get there.
> 
> To recapitulate the problem here, we might have a whole fabric
> attached to an HBA via SAS or FC, and be doing 500K IOPS happily to 50
> devices.  Then a single LUN goes wonky and all the IO stops while we
> try to recover that single device, which might take minutes.

I'm not James, but from my experience with pm8001 and libsas, your
understanding is right: when an error happens on one LUN, the SCSI core
does hold the whole SCSI host.
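
To illustrate the behaviour, here is a toy user-space model, not the
actual kernel code. All names (toy_host, toy_queue_cmd, and so on) are
hypothetical. It only assumes what is described above, as I understand
it: one failed command puts the whole host into recovery, new commands
for every LUN are refused while it is in recovery, and the error handler
can only start once every outstanding command has either completed or
failed.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
enum toy_host_state { TOY_RUNNING, TOY_RECOVERY };

struct toy_host {
	enum toy_host_state state;
	int busy;	/* commands outstanding on the whole host */
	int failed;	/* commands waiting for error handling */
};

/* Every LUN behind the host passes through the same gate. */
static bool toy_queue_cmd(struct toy_host *h, int lun)
{
	(void)lun;		/* the gate is per host, not per LUN */
	if (h->state == TOY_RECOVERY)
		return false;	/* healthy LUNs are refused too */
	h->busy++;
	return true;
}

/* A command completed normally. */
static void toy_cmd_done(struct toy_host *h)
{
	h->busy--;
}

/* A command failed or timed out: the whole host enters recovery. */
static void toy_cmd_failed(struct toy_host *h)
{
	h->failed++;
	h->state = TOY_RECOVERY;
}

/* The handler runs only when all outstanding commands have failed. */
static bool toy_eh_can_run(const struct toy_host *h)
{
	return h->failed > 0 && h->busy == h->failed;
}

/* Recovery finished: failed commands are disposed of, I/O resumes. */
static void toy_eh_finish(struct toy_host *h)
{
	h->busy -= h->failed;
	h->failed = 0;
	h->state = TOY_RUNNING;
}
```

In this model a single toy_cmd_failed() call makes toy_queue_cmd()
return false for every LUN until toy_eh_finish() is called, which is the
stall described in the quoted text.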

I think Hannes posted a good proposal a few weeks ago. It looked
reasonable, but I don't know what its status is now.

Regards
Jack Wang