Robin Hill wrote:
> You probably need to start it with missing members then, so it's able to
> run but not to resync.
This is not an option in assemble mode either. It looks as though the
array has to be recreated. I'm not sure why none of these options is
provided for assemble.
Anyway, in the end I did an "assemble --force" after stopping what was
left (5 drives dropped from a 16-drive RAID6), and it started but did
not initiate a resync.
Perhaps the behaviour here has changed, because I'm sure that when I've
done this in the past, it resynced straight away.
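One way to confirm whether the array is actually resyncing after a forced assemble is to read md's sync_action attribute in sysfs. The following is only a minimal sketch: the array name "md0" is a hypothetical placeholder, and the helper names are made up for illustration.

```python
from pathlib import Path

# sync_action values that mean md is actively rewriting or verifying data.
ACTIVE_STATES = {"resync", "recover", "reshape", "check", "repair"}

def is_syncing(sync_action_text):
    """Return True if the sync_action contents name an active operation."""
    return sync_action_text.strip() in ACTIVE_STATES

def read_sync_action(md_name="md0"):
    """Read sync_action for an array; "md0" is a hypothetical name."""
    return Path(f"/sys/block/{md_name}/md/sync_action").read_text()

# Example on a live system (not executed here):
#   print(is_syncing(read_sync_action("md0")))
```

A value of "idle" would match what was seen here: the array is running, but no resync has been started.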
There were some somewhat strange errors in the log:
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdf, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdf1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 15 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdh, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdh1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 14 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdg, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdg1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 13 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdp, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdp1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 12 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdr, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdr1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 11 devices.
The cause is a controller problem, but after the first 2 drives were
disabled, I don't know why there were "raid5: Operation continuing
on..." messages as another 3 drives were offlined. A RAID6 array should
stop when a third device fails.
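To make the mismatch concrete, a rough sketch like the following counts the "Disk failure" lines in a log excerpt and compares the total against the two failures a RAID6 array can tolerate. The function names are hypothetical, and the regular expression only assumes the message format shown in the log above.

```python
import re

# Matches md failure lines of the form seen in the log above.
FAILURE_RE = re.compile(r"raid5: Disk failure on (\w+), disabling device")

def failed_devices(log_text):
    """Return failed member names in order of appearance."""
    return FAILURE_RE.findall(log_text)

def exceeds_redundancy(log_text, max_failures=2):
    """True if more distinct members failed than the array tolerates.

    max_failures defaults to 2, the number of failures RAID6 survives.
    """
    return len(set(failed_devices(log_text))) > max_failures

sample = """\
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdf1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdh1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdg1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdp1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdr1, disabling device.
"""

print(failed_devices(sample))
print(exceeds_redundancy(sample))
```

Run against the excerpt, this reports five failed members, three more than RAID6 can lose, yet the log still shows "Operation continuing" after each one.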
Regards,
Richard