On Thu, 31 May 2012 08:31:58 +0000 Andy Smith <andy@xxxxxxxxxxxxxx> wrote:

> Now, is this sort of behaviour expected when under incredible load?
> Or is it indicative of a bug somewhere in kernel, mpt driver, or
> even flaky SAS controller/disks?

This sort of high load would not affect md, except to slow it down.
My guess is that the real bug is in the mpt driver, but as I know nothing
about the mpt driver, you should treat that guess with a few kilos of NaCl.

>
> Root cause of failure aside, could I have made recovery easier? Was
> there a better way than --create --assume-clean?

The mis-step was to try to add the devices back to the array. A newer mdadm
would refuse to let you do this because of the destructive effect.

The correct step would have been to stop the array and re-assemble it, with
--force.

Once you had turned the devices to spares with --add, --create --assume-clean
was the correct fix.

>
> If I had done a --create with sdc5 (the device that stayed in the
> array) and the other device with the closest event count, plus two
> "missing", could I have expected less corruption when on 'repair'?

Possibly. You certainly wouldn't expect more.

NeilBrown
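
For readers of the archive, the stop-and-reassemble sequence described above might look roughly like the sketch below. This is an illustration, not something taken from the thread: the array name /dev/md0 and the member names other than sdc5 are assumptions, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. /dev/md0 and the member list are assumed names for
# illustration; only sdc5 appears in the original message.
MD=${MD:-/dev/md0}
MEMBERS=${MEMBERS:-"/dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5"}

# Print each command by default; run it for real only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

reassemble() {
    # Stop the failed array first, rather than --add'ing members back
    # (which would turn them into spares).
    run mdadm --stop "$MD"
    # Re-assemble from the existing superblocks; --force lets mdadm
    # accept members whose event counts are slightly out of date.
    # $MEMBERS is deliberately unquoted so it splits into one word per device.
    run mdadm --assemble --force "$MD" $MEMBERS
}

reassemble
```

Checking the event counts with mdadm --examine on each member before forcing assembly is a sensible precaution, since --force will accept devices that are slightly out of date.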