David Lethe wrote:
What firmware, drivers & BIOS is the LSI controller running, and what is the exact model number?
The card is the SAS3442E-R, which uses the B3 revision of the 1068 controller, and it has the latest public versions of the BIOS and of the IT firmware.
Several things to consider:
- If you enabled SMART rather than telling the controller to enable SMART for the individual drives, then this will cause a problem depending on the specifics of what you have, especially if the controller is running the RAID firmware.
- There are firmware issues with some LSI chipsets, and with the driver/BIOS/MPT-library revision logic, which can cause bus resets. In this case, the bus reset made the controller think the disk had timed out on whatever I/O operations the controller had told it to perform, so the controller took the disk offline.
At this stage I am no longer concerned about using smartmontools. The card has performed flawlessly in all other respects, so I will simply avoid smartmontools in future.
I am concerned that when the drive was offlined, md was not made aware of it. Perhaps this is to be expected?
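For anyone wanting to check for the same mismatch, here is a minimal sketch of one way to compare md's view with the SCSI layer's view. It assumes the usual /proc/mdstat layout (members listed as name[N], with (F) marking a failed member) and that the SCSI device state is readable from /sys/block/<disk>/device/state. The helper names (parse_mdstat, scsi_state, stale_members) are made up for illustration, and mapping a partition to its disk by stripping trailing digits only works for sdX-style names.

```python
import re
from pathlib import Path

# Matches an md member such as "sdb1[1]" or "sda1[0](F)".
MEMBER_RE = re.compile(r"([\w-]+)\[\d+\](\([A-Z]\))?")


def parse_mdstat(text):
    """Return {array: {member: is_failed}} from /proc/mdstat-style text."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\w+)\s*:\s*(.*)$", line)
        if not m:
            continue
        name, rest = m.groups()
        # A member counts as failed only if md has flagged it with (F).
        arrays[name] = {dev: flag == "(F)" for dev, flag in MEMBER_RE.findall(rest)}
    return arrays


def scsi_state(disk, sysfs=Path("/sys/block")):
    """Read the SCSI device state (e.g. 'running' or 'offline'), or None."""
    try:
        return (sysfs / disk / "device" / "state").read_text().strip()
    except OSError:
        return None


def stale_members(mdstat_text, state_of=scsi_state):
    """List (array, member) pairs md still trusts although the disk is offline."""
    stale = []
    for array, members in parse_mdstat(mdstat_text).items():
        for member, failed in members.items():
            # Assumption: sdX-style naming, so "sdb1" lives on disk "sdb".
            disk = re.sub(r"\d+$", "", member)
            if not failed and state_of(disk) == "offline":
                stale.append((array, member))
    return stale
```

On a live system this could be called as stale_members(Path("/proc/mdstat").read_text()). Any pair it returns is a member that md has not marked failed even though the SCSI layer reports the disk as offline, which is the situation described above.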
Unfortunately this machine is in production now, so I cannot really participate in any more testing/debugging.
Regards,
Richard