On 12/3/19 5:14 am, Wols Lists wrote:
On 11/03/19 12:31, Nix wrote:
On 10 Mar 2019, Wols Lists uttered the following:
I'd like to modify the raid layer such that it times out quickly, and
recalculates and rewrites the data after a few seconds, such that these
drives cease to be a problem, but stick that on the long list of raid
papercuts I'd like to sort out when I can find the time to learn to
program the raid subsystem!
I don't see how that could work. When these drives get stuck on lengthy
retries, they are essentially unresponsive:
So any code needs to take that into account. Pain in the arse, but when
the Linux read times out, the re-write code needs to detect that the
drive is one of these cheapos, and spawn a thread that waits for the
drive time-out before rewriting it.
Of course, that's going to cause a host of other issues that will need
sorting/fixing :-) - the obvious one is what happens if something else
re-writes that block in the middle of the time-out period ...
Cheers,
Wol
Doesn't this happen already? The drive will either return the data (if
it magically succeeds in reading the requested data within those 180 or
so seconds), or it will return a read error. If MD gets the data, it
carries on normally (albeit with a delay). If MD gets a read error, it
will automatically reconstruct the data (assuming a working RAID array
with sufficient redundancy to do that without the data we were trying to
read) and issue a write to the drive. If the drive fails to write the
data and returns an error, then the drive is kicked from the array.
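
To make that sequence concrete, here is a toy model of the decision flow as I
understand it. It is only a sketch, not the actual MD code: the function name,
the enum and the three boolean inputs are all made up for illustration, and
the UNRECOVERABLE case is my own addition for when there isn't enough
redundancy left to reconstruct.

```python
from enum import Enum


class Outcome(Enum):
    DATA_RETURNED = "drive returned the data (possibly after a long delay)"
    REWRITTEN = "data reconstructed from redundancy and rewritten"
    DRIVE_KICKED = "rewrite failed, drive kicked from the array"
    UNRECOVERABLE = "not enough redundancy to reconstruct (assumed case)"


def handle_read(read_ok, can_reconstruct, write_ok):
    """Hypothetical model of what happens after one read request.

    read_ok         -- the drive eventually returned the data
    can_reconstruct -- the array has enough redundancy without this block
    write_ok        -- the drive accepted the rewrite of the rebuilt data
    """
    if read_ok:
        # No error ever reaches MD; it just carries on.
        return Outcome.DATA_RETURNED
    if not can_reconstruct:
        # Outside what is described above; included only for completeness.
        return Outcome.UNRECOVERABLE
    if write_ok:
        # Read error, data rebuilt from the other members and written back.
        return Outcome.REWRITTEN
    # The rewrite itself failed, so the drive is dropped.
    return Outcome.DRIVE_KICKED


if __name__ == "__main__":
    print(handle_read(read_ok=False, can_reconstruct=True, write_ok=True).value)
```

The only path that loses a drive is the last one, where the rewrite fails.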
AFAIR, the "problem" was that the kernel isn't configured (by default)
to wait 180s, so it gives up early, tries to reset the SATA bus, and
returns a failed read to MD. MD then issues the write request while the
kernel is still trying to re-contact the drive and the drive is still
busy trying to complete the original read request. We get a second
timeout, the kernel tries to reset the SATA bus again and returns a
failed write to MD, which now kicks the drive.
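
A rough timeline model of that failure mode, in case it helps. This is a
deliberate simplification (it ignores how long the bus resets themselves take,
and the function and constant names are invented), but it shows why a kernel
timeout shorter than the drive's retry period ends with the drive kicked. The
30s figure used below is the usual Linux default command timeout; the 180s is
the drive retry period discussed in this thread.

```python
KEPT = "drive stays in array"
KICKED = "drive kicked from array"


def simulate(kernel_timeout_s, drive_retry_s):
    """Return (outcome, seconds until the outcome is known).

    Simplified model: the drive is unresponsive for drive_retry_s seconds
    from the start of the bad read, then reports a read error.
    """
    if kernel_timeout_s >= drive_retry_s:
        # The drive gives up first and reports the read error itself;
        # it is responsive again, so MD's rewrite can go through.
        return (KEPT, drive_retry_s)

    # The kernel gives up first: MD sees a failed read at kernel_timeout_s
    # and immediately issues the rewrite to a drive that is still busy.
    read_failed_at = kernel_timeout_s
    drive_busy_for = drive_retry_s - read_failed_at

    if drive_busy_for > kernel_timeout_s:
        # The write times out as well; MD sees a failed write.
        return (KICKED, read_failed_at + kernel_timeout_s)

    # The drive frees up before the write times out.
    return (KEPT, drive_retry_s)


if __name__ == "__main__":
    for kernel_t, drive_t in [(30, 180), (180, 180), (30, 7)]:
        print(kernel_t, drive_t, simulate(kernel_t, drive_t))
```

With a 30s kernel timeout against a 180s drive, the model kicks the drive at
the 60s mark. With the kernel timeout raised to 180s, the drive survives at
the cost of the long stall.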
So, as long as root (i.e., the administrator) configures the kernel to
match the installed hardware (or the distribution magically detects and
configures this on behalf of the administrator), a single failed read
request does no lasting harm (i.e., no loss of redundancy, no failed
RAID arrays). Of course, there is still the 180s delay/freeze, but that
is a "better" overall outcome, and results in a good solution for most
admins/users.
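
For what it's worth, "configuring the kernel" here usually means raising the
per-device command timeout exposed in sysfs at /sys/block/<dev>/device/timeout
(in seconds). A minimal sketch of doing that from Python follows. The helper
name and the sysfs_root parameter are my own (the latter only so it can be
tried against a scratch directory), and 180 is just the figure from this
thread, not a recommendation for any particular drive.

```python
from pathlib import Path


def set_command_timeout(device, seconds, sysfs_root="/sys"):
    """Write a new command timeout for a block device and read it back.

    device     -- kernel name of the disk, e.g. "sda" (no /dev/ prefix)
    seconds    -- new timeout in whole seconds
    sysfs_root -- hypothetical override, for testing against a fake tree
    """
    if int(seconds) <= 0:
        raise ValueError("timeout must be a positive number of seconds")

    path = Path(sysfs_root) / "block" / device / "device" / "timeout"
    path.write_text(f"{int(seconds)}\n")

    # Read the value back so the caller can confirm what is now in effect.
    return int(path.read_text().strip())


if __name__ == "__main__":
    # Needs root; "sda" is only an example device name.
    print(set_command_timeout("sda", 180))
```

Note that the setting does not survive a reboot, so it would have to be
reapplied at boot (a udev rule or a boot script are the usual places).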
If it becomes a problem, then the admin can fix (replace) the hardware
with better options and solve both problems, reducing the "freeze/delay"
from around 180s to around 7s. (BTW, a 7s delay could also be
unacceptable for any number of users/admins. Personally, my users insist
on a 0.1s delay or less for *everything*, and anything worse is a major
incident.)
Regards,
Adam
--
Adam Goryachev Website Managers www.websitemanagers.com.au