Infinite fail-retry loop on degraded RAID1 array

I am experiencing a problem that I think was supposed to be fixed some
time ago (if I read the previous discussion correctly) but appears to
have resurfaced on my somewhat unusual configuration.

configuration info:

- PowerPC (440gx)
- kernel 2.6.12 with preemption disabled
- Adaptec AIC-9410 SAS/SATA controller
- two SATA drives
- "monolithic" Adaptec adp94xx driver that presents SATA drives as sd
SCSI devices without using libata
- degraded two-drive RAID 1 array that was rebuilding onto a new disk
when the "good" disk started returning read errors.

Obviously, the rebuild cannot complete without at least one good drive.

However, the problem is that the kernel keeps trying to read the bad
block on the "good" disk forever.  This swamps the serial console and
makes the network stack unresponsive to anything other than pings.

The system log looks like this:

adp94xx SATA Status = 0x51
adp94xx Completion Status Error = 0x40
(scsi0: Ch 1 Id 128 Lun 0): Abort requested for SCSI cmd dfc55060, opcode 0x28.
(scsi0: Ch 1 Id 128 Lun 0): Abort requested for SCSI cmd dfc55060, opcode 0x0.
(scsi0: Ch 1 Id 128 Lun 0): Cmd dfc55060 found on device queue.
end_request: I/O error, dev sdb, sector 3251760
scsi0 (128:0): rejecting I/O to offline device
raid1: sdb: rescheduling sector 3251760
scsi0 (128:0): rejecting I/O to offline device
raid1: sdb: redirecting sector 3251760 to another mirror
scsi0 (128:0): rejecting I/O to offline device
raid1: sdb: rescheduling sector 3251760
raid1: sdb: redirecting sector 3251760 to another mirror
scsi0 (128:0): rejecting I/O to offline device
raid1: sdb: rescheduling sector 3251760
raid1: sdb: redirecting sector 3251760 to another mirror
scsi0 (128:0): rejecting I/O to offline device
raid1: sdb: rescheduling sector 3251760

...ad infinitum
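
For anyone who wants to check whether their own log shows the same pattern, here is a minimal sketch in Python (not part of the setup described here) that counts how often raid1 reschedules the same sector. The regular expression is based only on the message format shown in the log above, and the function names and the threshold of 3 are arbitrary choices made for illustration.

```python
import re
import sys
from collections import Counter

# Matches the raid1 retry message format seen in the log above.
RESCHEDULE_RE = re.compile(r"raid1: (\w+): rescheduling sector (\d+)")


def count_reschedules(lines):
    """Return a Counter keyed by (device, sector) of rescheduling messages."""
    counts = Counter()
    for line in lines:
        match = RESCHEDULE_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def looping_sectors(lines, threshold=3):
    """Return the (device, sector) pairs rescheduled at least `threshold` times.

    The default threshold is an arbitrary illustrative value.
    """
    return {key: n for key, n in count_reschedules(lines).items() if n >= threshold}


if __name__ == "__main__":
    # Read a kernel log from standard input and report suspicious sectors.
    for (dev, sector), n in sorted(looping_sectors(sys.stdin).items()):
        print(f"{dev} sector {sector}: rescheduled {n} times")
```

It could be fed a saved console capture on standard input. A sector that shows up with an ever-growing count is the loop described here.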

Is this refusal of the RAID system to give up in the face of an
unresolvable error a known/expected behavior?  Or is something in my
configuration causing unusually degenerate behavior?
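
To make the question concrete, the "give up" behavior being asked about can be sketched as a toy model in Python. This is purely illustrative and is not the md/raid1 driver's code. The names `read_with_failover` and `ReadError` are hypothetical, and the mirrors are modeled as plain callables.

```python
class ReadError(Exception):
    """Raised when no mirror can serve the requested sector."""


def read_with_failover(mirrors, sector):
    """Try each mirror at most once and fail cleanly if all of them fail.

    `mirrors` maps a device name to a callable read(sector) -> bytes that
    raises OSError on an I/O error. This is a toy stand-in for a block device.
    """
    failed = []
    for name, read in mirrors.items():
        try:
            return name, read(sector)
        except OSError:
            # Remember the failure so this mirror is never retried.
            failed.append(name)
    # Every mirror has been tried once; report the error upward
    # instead of rescheduling the same read again.
    raise ReadError(f"sector {sector}: no mirror could serve the read (tried {failed})")
```

In this model a read against a single failing mirror ends with one error instead of an endless reschedule/redirect cycle.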

Many thanks,
~Matt Harding~