Hi,
I'm running kernel 2.6.27 with LVM on top of software RAID 1 on a pair of SAS disks.
Recently we started seeing messages like the following:
Nov 28 08:57:10 kernel: end_request: I/O error, dev sda, sector 1758169523
Nov 28 08:57:10 kernel: md: super_written gets error=-5, uptodate=0
Nov 28 08:57:10 kernel: raid1: Disk failure on sda2, disabling device.
Nov 28 08:57:10 kernel: raid1: Operation continuing on 1 devices.
We're working through our changes to figure out what might have
triggered it, but it seems likely the root cause lies in the core code.
We're assuming it's a software issue since it's reproducible on multiple
new-ish systems, although so far we've only tried it on systems with one
particular configuration. We're planning to try it with different
disks just to be sure.
For what it's worth, we've seen the problem with the disk write cache
both enabled and disabled.
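
To compare occurrences across systems, the messages quoted above could be
pulled out of the logs with something like the following. This is only a
minimal sketch, assuming the exact message wording shown above; the function
name scan() and both patterns are illustrative, and other kernel versions
may format these lines differently.

```python
import re

# Patterns assume the message wording quoted in this post (2.6.27).
# error=-5 in the md line is -EIO, the generic I/O error code.
IO_ERR = re.compile(
    r"end_request: I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)"
)
MD_FAIL = re.compile(
    r"raid1: Disk failure on (?P<member>\w+), disabling device"
)


def scan(lines):
    """Return (io_errors, md_failures) found in an iterable of log lines.

    io_errors   -- list of (device, sector) tuples from end_request lines
    md_failures -- list of RAID 1 member names that md disabled
    """
    io_errors, md_failures = [], []
    for line in lines:
        m = IO_ERR.search(line)
        if m:
            io_errors.append((m.group("dev"), int(m.group("sector"))))
            continue
        m = MD_FAIL.search(line)
        if m:
            md_failures.append(m.group("member"))
    return io_errors, md_failures


# The four lines from the log excerpt above.
sample = [
    "Nov 28 08:57:10 kernel: end_request: I/O error, dev sda, sector 1758169523",
    "Nov 28 08:57:10 kernel: md: super_written gets error=-5, uptodate=0",
    "Nov 28 08:57:10 kernel: raid1: Disk failure on sda2, disabling device.",
    "Nov 28 08:57:10 kernel: raid1: Operation continuing on 1 devices.",
]

io_errors, md_failures = scan(sample)
print(io_errors)
print(md_failures)
```

Running the same scan over the logs from each affected system would show
whether the failing sector is always the same or moves around.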
Anyone have any ideas, or pointers as to what I should look at?
Thanks,
Chris
--
Chris Friesen
Software Designer
3500 Carling Avenue
Ottawa, Ontario K2H 8E9
www.genband.com