Hi,

For quite some time, one of our servers running Red Hat Linux 7.1 (SMP, SCSI) with raid1 on two identical SCSI disks has been giving me nightmares (see my mail from Nov 14th, 2001): after some time, read errors like the following occur, causing the raid to get out of sync.

    Additional sense indicates Unrecovered read error
    I/O error: dev 08:19, sector 12850360
    raid1: sdb9: rescheduling block 12850360
    md: recovery thread got woken up ...
    md4: no spare disk to reconstruct array! -- continuing in degraded mode
    md: recovery thread finished ...

Running raidhotremove and raidhotadd on the faulty partition gets the raid in sync again, but these errors keep occurring faster and faster, until at some point it is nearly impossible to sync the raid again.

What we have already done to overcome this error:

- performed RAM checks
- replaced the "faulty" disks with new ones
- replaced the SCSI controller (the new one has a newer BIOS release)
- replaced the SCSI cabling
- checked the disks with vendor programs
- checked CPU temperature and fan speed
- reduced the transfer rate on the SCSI bus

Needless to say, the checks didn't find any errors. The strange thing is that after such a replacement the system works fine for a while, but after some weeks the problem starts all over again:

    [rfu@host tmp]$ cp /tmp/IBMJava2-SDK-13.tgz .
    [rfu@host tmp]$ diff /tmp/IBMJava2-SDK-13.tgz ./IBMJava2-SDK-13.tgz
    [rfu@host tmp]$ diff /tmp/IBMJava2-SDK-13.tgz ./IBMJava2-SDK-13.tgz
    Binary files /tmp/IBMJava2-SDK-13.tgz and ./IBMJava2-SDK-13.tgz differ
    [rfu@host tmp]$ diff /tmp/IBMJava2-SDK-13.tgz ./IBMJava2-SDK-13.tgz
    Binary files /tmp/IBMJava2-SDK-13.tgz and ./IBMJava2-SDK-13.tgz differ
    [rfu@host tmp]$

(No disk R/W error is reported by the kernel during this. The current directory is /home/rfu/tmp, which is located on a different partition than /tmp.)

Might it be a bug in the disk caching subsystem of the kernel? It is strange that the first diff works but the subsequent ones don't.
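The diff output only says that the two files differ, not which of them changed between reads. A minimal sketch of how the comparison could be repeated with checksums instead, assuming md5sum is installed; the function name compare_repeatedly and its arguments are hypothetical and only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: checksum a source file and its copy several times
# and print both sums on every pass, so it is visible which one varies.
# Usage: compare_repeatedly SRC COPY [ROUNDS]
compare_repeatedly() {
    src=$1
    copy=$2
    rounds=${3:-5}
    i=1
    bad=0
    while [ "$i" -le "$rounds" ]; do
        # Read via stdin so md5sum prints "<hash>  -" for both files.
        a=$(md5sum < "$src") || return 2
        b=$(md5sum < "$copy") || return 2
        echo "pass $i: src=${a%% *} copy=${b%% *}"
        [ "$a" = "$b" ] || bad=$((bad + 1))
        i=$((i + 1))
    done
    echo "mismatching passes: $bad of $rounds"
    # Exit status 0 only if every pass matched.
    [ "$bad" -eq 0 ]
}
```

It could be called as "compare_repeatedly /tmp/IBMJava2-SDK-13.tgz ./IBMJava2-SDK-13.tgz 10". If one column of checksums stays constant while the other changes from pass to pass, that would at least narrow down which partition returns varying data.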
Meanwhile I think that I'm on the wrong list here, since I guess that this is not the fault of the raid subsystem; I'd be glad if anyone could point me to a more suitable place for my problem (or help me get rid of it).

Thanks in advance.

Rainer