Hi Raoul,

We use the same disks with the same firmware here at Stratus and have
never experienced the issue you're observing. Maybe it's due to the fact
that the hardware RAID on the AIC-9410W is enabled? If you're using md
then there's no reason to keep it on.

Our configurations are almost identical except for:

- hardware RAID disabled
- directly attached
- md level 1
- seq: V32A4
- bios/firmware 2.0-2 1822/1021

The BIOS and firmware revs may be specific to our implementation since
the SAS chip is glued to our PCI-X riser.

Are your disks directly attached or are you using a SAS expander?

Peter

> -----Original Message-----
> From: linux-scsi-owner@xxxxxxxxxxxxxxx [mailto:linux-scsi-owner@xxxxxxxxxxxxxxx] On Behalf Of Raoul Bhatia [IPAX]
> Sent: Wednesday, April 16, 2008 12:47 PM
> To: Leonid Kalmankin
> Cc: linux-scsi@xxxxxxxxxxxxxxx
> Subject: Re: aic94xx + ST3146855SS still failing under heavy load
>
> hi,
>
> some others, like me, are struggeling with this problem.
> afaik, james bottomley (or someone else?) is working on a fix,
> but it will take some more time.
>
> please see [1] and [2].
>
> btw. i asked seagate and adaptec and both did not come up with a decent
> solution. seagate asked me to verify this with a different controller
> and said that they know of no issue and adaptec gave me a new sequencer
> firmware - so at least the server is still responding properly - and
> told me that all the fixes went into the recent 2.6.25rc6+ kernel.
>
> cheers,
> raoul
> [1] http://marc.info/?t=120603924200004
> [2] http://marc.info/?t=120757821700007
>
> Leonid Kalmankin wrote:
> > Hello!
> >
> > We have a system with:
> >
> > vanilla 2.6.25-rc8 (2.6.23, 2.6.24 have the same behaviour)
> >
> > Adaptec AIC-9410W SAS (Razor ASIC RAID) (rev 09)
> > aic94xx: Found sequencer Firmware version 1.1 (V30)
> > (Firmware version 1.1 (V17/10c6) makes no difference)
> > scsi 2:0:0:0: Direct-Access SEAGATE ST3146855SS 0002 PQ: 0 ANSI: 5
> >
> >
> > It reliably fails under heavy IO:
> >
> >> sas: command 0xffff81022c5f5640, task 0xffff8101f6b0f000, timed out: EH_NOT_HANDLED
> >> sas: command 0xffff81022c5f5500, task 0xffff8101f6b0f1c0, timed out: EH_NOT_HANDLED
> >> ....
> >> sas: Enter sas_scsi_recover_host
> >> sas: trying to find task 0xffff8101f6b0f000
> >> sas: sas_scsi_find_task: aborting task 0xffff8101f6b0f000
> >> aic94xx: task 0xffff8101f6b0f000 done with opcode 0x1e resp 0x0 stat 0x8d but aborted by upper layer!
> >> aic94xx: tmf tasklet complete
> >> aic94xx: tmf came back
> >> aic94xx: asd_abort_task: task 0xffff8101f6b0f000 done
> >> aic94xx: task 0xffff8101f6b0f000 aborted, res: 0x0
> >> sas: sas_scsi_find_task: task 0xffff8101f6b0f000 is done
> >> sas: sas_eh_handle_sas_errors: task 0xffff8101f6b0f000 is done
> >> sas: --- Exit sas_scsi_recover_host
> >
> > Sometimes it successfully recovers; sometimes the disk is lost until the reboot.
> >
> > I've read http://archive.netbsd.se/?ml=linux-scsi&a=2008-01&t=6260524
> > Asked Seagate about firmware update; they told me they do not have any.
> >
> > As I understood, the root of this problem is protocol errors in disk's firmware
> > (other disks, for example FUJITSU MBA3147RC work fine); however, that kind of errors
> > should be recoverable by sas/aic94xx drivers.
> >
> > If that is true, I could test some patches/ideas, where should I start?
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
> --
> ____________________________________________________________________
> DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia@xxxxxxx
> Technischer Leiter
>
> IPAX - Aloy Bhatia Hava OEG         web.          http://www.ipax.at
> Barawitzkagasse 10/2/2/11           email.            office@xxxxxxx
> 1190 Wien                           tel.               +43 1 3670030
> FN 277995t HG Wien                  fax.            +43 1 3670030 15
> ____________________________________________________________________