On Sun, 23 Sep 2007 15:19:13 +0200
"Torsten Kaiser" <just.for.lkml@xxxxxxxxxxxxxx> wrote:

> On 9/21/07, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > SG chaining bits:
> > - This is the bulk of the patchset. It consists of three major
> >   components:
> >
> >     - sglist-core, which adds helpers for iterating sg lists and
> >       switches the block layer and SCSI to use those. Should not
> >       have any functional changes.
> >     - sglist-drivers, which converts drivers to use the sg list
> >       helpers. Again, should not contain functional changes.
> >     - sglist-arch, which adds support to most architectures and
> >       actually enables sg chaining.
>
> Adding linux-ide and linux-scsi as CC like Andrew did with my last report.
>
> I still have trouble with my Silicon Image, Inc. SiI 3132 Serial ATA
> Raid II Controller as reported on 2.6.23-rc4-mm1 on the new
> 2.6.23-rc6-mm1.
>
> I'm not 100% sure if this is caused by the sg chaining, but the patch
> from http://lkml.org/lkml/2007/9/10/251 which touches that chaining
> makes a difference, so it might be related.
>
> First report: http://lkml.org/lkml/2007/9/1/92
> With patch it fails fewer times: http://lkml.org/lkml/2007/9/14/107
>
> To update the statistics:
> prior to 2.6.23-rc4-mm1: no trouble with any drives on the SiI 3132.
> 2.6.23-rc4-mm1 without patch: 2 out of 2 bad.
> back to 2.6.23-rc3-mm1: 18x good.
> 2.6.23-rc4-mm1 with patch: 2 out of 8 bad
> after that second mail:
> 2.6.23-rc4-mm1 with patch: 1 out of 5 bad
> 2.6.23-rc6-mm1: 1 out of 2 bad

git-block.patch in 2.6.23-rc6-mm1 includes my patch that disables sg
chaining for libata, but it still includes libata's sg chaining
changes. So either these changes break libata, or something else broke
libata after 2.6.23-rc3-mm1.

Can you try Jens's sglist-arch branch? If it works, libata in -mm
probably has bugs.
For your convenience, I have put up a patch of the sglist-arch branch
against v2.6.23-rc7:

http://www.kernel.org/pub/linux/kernel/people/tomo/misc/v2.6.23-rc7-sglist-arch.diff.bz2

> switching back to 2.6.23-rc3-mm1 to rule out the hardware:
> 2.6.23-rc3-mm1: 6x good
>
> The error messages from the failed 2.6.23-rc6-mm1:
> Sep 18 18:50:01 treogen [ 33.340000] md1: bitmap initialized from disk: read 10/10 pages, set 0 bits
> Sep 18 18:50:01 treogen [ 33.340000] created bitmap (145 pages) for device md1
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> Sep 18 18:50:01 treogen [ 63.440000] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: status: {DRDY }
> Sep 18 18:50:01 treogen [ 63.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 65.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 65.740000] ata1: reset failed (errno=-5), retrying in 8 secs
> Sep 18 18:50:01 treogen [ 73.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 75.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 75.740000] ata1: reset failed (errno=-5), retrying in 8 secs
> Sep 18 18:50:01 treogen [ 83.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 85.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 85.740000] ata1: reset failed (errno=-5), retrying in 33 secs
> Sep 18 18:50:01 treogen [ 118.440000] ata1: limiting SATA link speed to 1.5 Gbps
> Sep 18 18:50:01 treogen [ 118.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 120.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 120.740000] ata1: reset failed, giving up
> Sep 18 18:50:01 treogen [ 120.740000] ata1.00: disabled
> Sep 18 18:50:01 treogen [ 120.740000] ata1: EH complete
> Sep 18 18:50:01 treogen [ 120.740000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Sep 18 18:50:01 treogen [ 120.740000] end_request: I/O error, dev sda, sector 625137161
> Sep 18 18:50:01 treogen [ 120.740000] md: super_written gets error=-5, uptodate=0
> Sep 18 18:50:01 treogen [ 120.740000] raid5: Disk failure on sda2, disabling device. Operation continuing on 2 devices
>
> After that many more errors like this, only differing in the sector number:
> Sep 18 18:50:01 treogen [ 120.810000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Sep 18 18:50:01 treogen [ 120.810000] end_request: I/O error, dev sda, sector 19550919
>
> Any more info needed?
>
> Torsten
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
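
As a cross-check on the quoted log, the "cmd" line of the timed-out
request can be decoded by hand. The following is a minimal sketch, not
part of the original report. It assumes the usual libata error-report
field order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:
hob_lbal:hob_lbam:hob_lbah/device) and the NCQ convention that the
sector count sits in the feature fields and the tag in bits 7:3 of
nsect. The function name decode_ncq_taskfile is hypothetical.

```python
# Sketch only: the field order and the NCQ layout are assumptions
# (see the paragraph above), not something stated in the report.

NCQ_COMMANDS = {0x60: "READ FPDMA QUEUED", 0x61: "WRITE FPDMA QUEUED"}


def decode_ncq_taskfile(cmd: str) -> dict:
    """Decode a 'cmd' taskfile dump such as 61/08:00:09:d6:42/00:00:25:00:00/40."""
    command_s, low_s, high_s, device_s = cmd.split("/")
    command = int(command_s, 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low_s.split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high_s.split(":")
    )

    # 48-bit LBA: the "hob" (high order byte) fields carry bits 47:24.
    lba = (
        hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
        | lbah << 16 | lbam << 8 | lbal
    )
    # For NCQ commands the sector count lives in the feature fields;
    # a count of 0 means 65536 sectors.
    sectors = (hob_feature << 8 | feature) or 65536

    return {
        "command": NCQ_COMMANDS.get(command, hex(command)),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,  # assumes 512-byte logical sectors
        "tag": nsect >> 3,
        "device": int(device_s, 16),
    }


if __name__ == "__main__":
    print(decode_ncq_taskfile("61/08:00:09:d6:42/00:00:25:00:00/40"))
```

Under these assumptions the request decodes to an 8-sector (4096-byte)
queued write at LBA 625137161 with tag 0. That agrees with "tag 0 ...
data 4096 out" on the same log line and with the sector reported by the
first end_request error.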