On 3/31/19 00:03, Phil Turmel wrote:
> Hi Jorge,
>
> Very good report.

Thanks. You can credit the clear instructions on the wiki for that.
> On 3/30/19 7:02 PM, Jorge R. Frank wrote:
>> Have determined that the drives came with SCT ERC disabled by default.
>> I ran the script on this page successfully but have not yet set it to
>> run at every boot:
>> <https://raid.wiki.kernel.org/index.php/Timeout_Mismatch>
>
> Consider not buying cheap drives when the time comes to replace. The
> boot script will suit until then.
In my defense, I was young, stupid, and unsupervised when I built the
array. Hard to argue with the results. The system has been running
practically 24/7 since December 2008 and this is the first glitch I
couldn't fix by simply re-seating SATA cables and rebooting.
One thing I would like to confirm is where to call the SCT ERC script in
the boot process. The wiki wasn't clear on that point.
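
For reference, the per-drive logic of such a script can be sketched as below. This is a minimal Python rendering under stated assumptions, not the wiki's script itself: the function names are hypothetical, the values 70 (7.0 seconds of SCT ERC) and 180 (seconds of kernel command timeout) are the commonly cited ones rather than anything confirmed in this thread, and any nonzero smartctl exit status is treated as "ERC not available", which is a simplification. It must run as root and would have to be repeated for every member drive on each boot.

```python
import subprocess

ERC_DECISECONDS = 70      # assumed: 7.0 s read/write recovery limit
FALLBACK_TIMEOUT_S = 180  # assumed: kernel timeout used when ERC is unavailable


def write_sysfs(path, value):
    """Write a value to a sysfs attribute (requires root)."""
    with open(path, "w") as f:
        f.write(value)


def fix_timeout(dev, run=subprocess.run, write=write_sysfs):
    """Try to enable SCT ERC on /dev/<dev>; otherwise raise the kernel timeout.

    `dev` is a bare name such as "sdd". Returns "erc" or "timeout" to say
    which branch was taken. `run` and `write` are injectable for dry runs.
    """
    setting = f"scterc,{ERC_DECISECONDS},{ERC_DECISECONDS}"
    result = run(["smartctl", "-l", setting, f"/dev/{dev}"],
                 stdout=subprocess.DEVNULL)
    if result.returncode == 0:
        return "erc"
    # Drive refused the ERC setting: give the kernel more patience instead.
    write(f"/sys/block/{dev}/device/timeout", str(FALLBACK_TIMEOUT_S))
    return "timeout"
```

Calling fix_timeout("sdd") and so on for each member would cover the array; the list of members is left to the caller because it is specific to the machine.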
> All of this is consistent with a controller issue knocking out those two
> drives simultaneously. The correct solution is to use --assemble
> --force with explicit device names (not using --scan).
>
> You should use fsck to clean up any unavoidable fs corruption from
> in-flight I/O before mounting.
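
The sequence described above (forced assembly with explicit members, then a filesystem check before mounting) can be sketched as a small helper that only builds the command lines. Everything specific here is an assumption: the function name is hypothetical, the array node and member names must be supplied by the caller, the filesystem is assumed to sit directly on the array, and the read-only first pass with fsck -n is a cautious extra step rather than something stated in the thread. The helper does not execute anything.

```python
def recovery_commands(array, members):
    """Return argv lists for a forced assemble and a no-change fsck pass.

    `array` is the md device node and `members` the explicit member
    devices; both are supplied by the caller, nothing is scanned for.
    """
    if not members:
        raise ValueError("list the member devices explicitly")
    assemble = ["mdadm", "--assemble", "--force", array, *members]
    # -n is handed to the filesystem-specific checker; for the common
    # checkers it means "report problems but do not modify anything".
    check = ["fsck", "-n", array]
    return [assemble, check]
```

Printing the returned lists before running them by hand gives a chance to confirm the device names, which matters because names like sdd and sde can change between boots.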
Would you recommend explicitly including all four devices, since sdd and
sde have the same event count? Or just three, arbitrarily picking one of
sdd/sde to include, then adding a new fourth drive? Due to the age of
the system and the fact that the motherboard SATA controller now has a
strike against it, my plan upon recovery is to immediately back up the
array and replace the entire system. So if the former would work on a
short-term basis, I'd be willing to try it.
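
To make the event-count comparison concrete, here is a minimal sketch that pulls the "Events" value out of saved `mdadm --examine` output and groups devices by it. The function names are hypothetical, and the parser assumes the count is printed as a plain integer on its own "Events :" line; output in any other form is reported as None rather than guessed at.

```python
import re

# Matches a line such as "         Events : 1234" (integer form only).
_EVENTS = re.compile(r"^\s*Events\s*:\s*(\d+)\s*$", re.MULTILINE)


def event_count(examine_text):
    """Return the event count from `mdadm --examine` text, or None."""
    match = _EVENTS.search(examine_text)
    return int(match.group(1)) if match else None


def group_by_events(counts):
    """Map each event count to the sorted devices reporting it.

    `counts` maps device name -> event count. Groups are ordered from
    the highest count to the lowest.
    """
    groups = {}
    for dev, n in counts.items():
        groups.setdefault(n, []).append(dev)
    return {n: sorted(groups[n]) for n in sorted(groups, reverse=True)}
```

Feeding it the examine output of all four members would show at a glance which drives agree with each other and how far apart the groups are.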
Thanks again,
JRF