[Bug 11117] aic94xx doesn't sustain the load when more than 2 SAS drives are connected and actively used

bugme-daemon@xxxxxxxxxxxxxxxxxxx · Mon, 28 Jul 2008 07:45:06 -0700 (PDT)

http://bugzilla.kernel.org/show_bug.cgi?id=11117

------- Comment #1 from anonymous@xxxxxxxxxxxxxxxxxxxx  2008-07-28 07:45 -------
Reply-To: James.Bottomley@xxxxxxxxxxxxxxxxxxxxx

On Fri, 2008-07-18 at 08:37 -0700, bugme-daemon@xxxxxxxxxxxxxxxxxxx
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11117
> 
>            Summary: aic94xx doesn't sustain the load when more than 2 SAS
>                     drives are connected and actively used
[...]
> aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
> sas: command 0xffff8101d39733c0, task 0xffff8105e9e51240, timed out:
> EH_NOT_HANDLED
> sas: command 0xffff8104db3d1e40, task 0xffff8105ed10a6c0, timed out:

This is more or less a known problem with aic94xx.  Its root cause is
that there are certain bus conditions the firmware requires help with.
REQ_TASK_ABORT is one of them (reason 0x6 means there was a protocol
error on the bus).  What the card would like is for us to abort and
retransmit that command immediately (running abort).  What we actually
do is to mark the command for abort by the error handler, halt all
in-progress commands and wake up the eh thread.  This causes a nasty
hiccough in the data flow and runs into a potential snowball effect in
that if we get another REQ_TASK_ABORT on the retry of all the halted
commands (and there are quite a number of them), we have to do
everything over again (do this too often and the command will time out).

The fix is to alter the aic94xx code to do a running abort (as in do it
itself on the single command instead of halting everything and waking
the error handler).  Unfortunately no-one's found the time to sit down
and code this up yet.
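
To make the cost difference concrete, here is a toy model in Python.  It
is not aic94xx or libsas code: the function names, the idea of a fixed
"fault schedule", and the assumption that every halted command is
retransmitted in full are all made up for illustration.  It only counts
how many transmissions each strategy needs for the same bus errors.

```python
from itertools import chain, repeat


def halt_everything(n_commands, faults):
    """Toy model of the current behaviour (hypothetical, not driver code).

    Any protocol error in a pass halts the whole batch, wakes the error
    handler, and every command is sent again.
    `faults` is a sequence of booleans: True means that transmission
    hits a protocol error (REQ_TASK_ABORT in the log above).
    Returns (transmissions, eh_wakeups).
    """
    schedule = chain(faults, repeat(False))  # clean bus once faults run out
    transmissions = 0
    eh_wakeups = 0
    while True:
        results = [next(schedule) for _ in range(n_commands)]
        transmissions += n_commands
        if not any(results):
            return transmissions, eh_wakeups
        eh_wakeups += 1  # one error: redo the entire batch


def running_abort(n_commands, faults):
    """Toy model of the proposed fix (hypothetical, not driver code).

    Only the command that hit the error is aborted and retransmitted;
    the other commands keep going and the error handler is never woken.
    Returns (transmissions, eh_wakeups).
    """
    schedule = chain(faults, repeat(False))
    transmissions = 0
    for _ in range(n_commands):
        transmissions += 1
        while next(schedule):
            transmissions += 1  # retransmit just this command
    return transmissions, 0


if __name__ == "__main__":
    faults = [False, True, False, False, True]
    print("halt everything:", halt_everything(4, faults))
    print("running abort:  ", running_abort(4, faults))
```

With four commands and that schedule, both strategies see two protocol
errors.  The halt-everything model needs 12 transmissions and two error
handler wakeups; the running-abort model needs 6 transmissions and none.
The gap widens with the number of in-flight commands, which matches the
snowball effect described above.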

James
