Hello,

we have RAID5 configured and removed one disk. The system then hangs on I/O for over a minute (for example, when copying a big file, cp sits in 'uninterruptible sleep') before continuing in degraded mode. Lots of SCSI errors are logged while the I/O is pending (kernel 2.4.19).

Is it possible to reduce this dead time? What controls the fact that md recognizes the disk failure at 17:37:09 but only removes sde1 at 17:38:23, over one minute later?

I had a look at md.c and the other sources/includes and found the printk() messages, but I'm not familiar with the concept... please help.

Excerpt of /var/log/messages:

Apr 25 17:37:09 r16 kernel: SCSI disk error : host 1 channel 0 id 2 lun 0 return code = 10000
Apr 25 17:37:09 r16 kernel: I/O error: dev 08:41, sector 5396720
Apr 25 17:37:09 r16 kernel: raid5: Disk failure on sde1, disabling device. Operation continuing on 3 devices
Apr 25 17:37:09 r16 kernel: md: recovery thread got woken up ...
Apr 25 17:37:09 r16 kernel: md: updating md5 RAID superblock on device
Apr 25 17:37:09 r16 kernel: md: sdh1 [events: 00000003]<6>(write) sdh1's sb offset: 5124608
Apr 25 17:37:09 r16 kernel: SCSI disk error : host 1 channel 0 id 2 lun 0 return code = 10000
Apr 25 17:37:09 r16 kernel: I/O error: dev 08:41, sector 5396728
Apr 25 17:37:10 r16 kernel: md: sdg1 [events: 00000003]<6>(write) sdg1's sb offset: 5124608
Apr 25 17:37:10 r16 kernel: SCSI disk error : host 1 channel 0 id 2 lun 0 return code = 10000
Apr 25 17:37:10 r16 kernel: I/O error: dev 08:41, sector 5396992
... SCSI disk error... + I/O error...
Apr 25 17:37:14 r16 kernel: md: sdf1 [events: 00000003]<6>(write) sdf1's sb offset: 5124608
... SCSI disk error... + I/O error...
Apr 25 17:37:15 r16 kernel: md: (skipping faulty sde1 )
Apr 25 17:37:15 r16 kernel: md5: no spare disk to reconstruct array! -- continuing in degraded mode
Apr 25 17:37:15 r16 kernel: md: recovery thread finished ...
... SCSI disk error... + I/O error...
Apr 25 17:38:09 r16 kernel: scsi1:0:2:0: Attempting to queue an ABORT message
Apr 25 17:38:09 r16 kernel: scsi1: Dumping Card State while idle, at SEQADDR 0x8
... driver messages ...
Apr 25 17:38:09 r16 kernel: (scsi1:A:2:0): Queuing a recovery SCB
Apr 25 17:38:09 r16 kernel: scsi1:0:2:0: Device is disconnected, re-queuing SCB
Apr 25 17:38:09 r16 kernel: Recovery code sleeping
Apr 25 17:38:09 r16 kernel: Recovery SCB completes
Apr 25 17:38:09 r16 kernel: Recovery code awake
Apr 25 17:38:09 r16 kernel: aic7xxx_abort returns 0x2002
Apr 25 17:38:09 r16 kernel: scsi1:0:2:0: Attempting to queue a TARGET RESET message
Apr 25 17:38:09 r16 kernel: scsi1:0:2:0: Command not found
Apr 25 17:38:09 r16 kernel: aic7xxx_dev_reset returns 0x2002
Apr 25 17:38:15 r16 kernel: scsi: device set offline - not ready or command retry failed after bus reset: host 1 channel 0 id 2 lun 0
Apr 25 17:38:15 r16 kernel: SCSI disk error : host 1 channel 0 id 2 lun 0 return code = 10000
Apr 25 17:38:15 r16 kernel: I/O error: dev 08:41, sector 5396760
Apr 25 17:38:15 r16 kernel: I/O error: dev 08:41, sector 5396768
Apr 25 17:38:23 r16 kernel: md: trying to remove sde1 from md5 ...
Apr 25 17:38:23 r16 kernel: RAID5 conf printout:
Apr 25 17:38:23 r16 kernel: --- rd:4 wd:3 fd:1
Apr 25 17:38:23 r16 kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:sde1
Apr 25 17:38:23 r16 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdf1
Apr 25 17:38:23 r16 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdg1
Apr 25 17:38:23 r16 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdh1
Apr 25 17:38:23 r16 kernel: RAID5 conf printout:
Apr 25 17:38:23 r16 kernel: --- rd:4 wd:3 fd:1
Apr 25 17:38:23 r16 kernel: disk 0, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Apr 25 17:38:23 r16 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdf1
Apr 25 17:38:23 r16 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdg1
Apr 25 17:38:23 r16 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdh1
Apr 25 17:38:23 r16 kernel: md: unbind<sde1,3>
Apr 25 17:38:23 r16 kernel: md: export_rdev(sde1)
Apr 25 17:38:23 r16 kernel: md: updating md5 RAID superblock on device
Apr 25 17:38:23 r16 kernel: md: sdh1 [events: 00000004]<6>(write) sdh1's sb offset: 5124608
Apr 25 17:38:23 r16 kernel: md: sdg1 [events: 00000004]<6>(write) sdg1's sb offset: 5124608
Apr 25 17:38:23 r16 kernel: md: sdf1 [events: 00000004]<6>(write) sdf1's sb offset: 5124608

Thanx,
Andreas.Kahnt@coware.de

Coware AG
---------------------------------------------------------
Landsberger Str. 402
D-81241 München
Telefon +49 (0)89 568 236 - 22, Fax -70
www.coware.de
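
P.S. To put numbers on the dead time, here is a small sketch that computes the offsets between the key events in the excerpt. The four lines are copied from the log above; the helper names (`syslog_time`, `offsets_from_first`) and the placeholder year are assumptions made only for illustration, since syslog timestamps carry no year. It only does timestamp arithmetic and says nothing about which timeout is responsible.

```python
from datetime import datetime

# Four lines copied from the /var/log/messages excerpt in this post.
EVENTS = [
    "Apr 25 17:37:09 r16 kernel: raid5: Disk failure on sde1, disabling device. Operation continuing on 3 devices",
    "Apr 25 17:38:09 r16 kernel: scsi1:0:2:0: Attempting to queue an ABORT message",
    "Apr 25 17:38:15 r16 kernel: scsi: device set offline - not ready or command retry failed after bus reset: host 1 channel 0 id 2 lun 0",
    "Apr 25 17:38:23 r16 kernel: md: trying to remove sde1 from md5 ...",
]


def syslog_time(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is a placeholder (assumption): only differences are used.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def offsets_from_first(lines):
    """Return the offset in seconds of each line relative to the first one."""
    start = syslog_time(lines[0])
    return [(syslog_time(line) - start).total_seconds() for line in lines]


if __name__ == "__main__":
    for offset, line in zip(offsets_from_first(EVENTS), EVENTS):
        # Strip the timestamp and hostname, keep the kernel message.
        message = line.split(" kernel: ", 1)[1]
        print(f"+{offset:4.0f}s  {message}")
```

By this arithmetic the first ABORT is queued 60 seconds after the disk failure is reported, the device is set offline at 66 seconds, and md tries to remove sde1 at 74 seconds.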