On Sun, 3 Aug 2014, NeilBrown wrote:
You are very unlikely to see UREs just by reading the drive over and over
again. You could easily do that for years and not get an error. Or maybe you
got one just then.
Also, you might get an intermittent URE. I have had drives where the sector
would be read successfully after several attempts. Why the drive
doesn't re-write the sector when it needs hundreds or thousands of
attempts to read it, I don't know. I would very much like to talk to
someone who really knows how these things work end-to-end, but I don't
have access to anyone like that. Most of the information to be found
publicly comes from people deducing behaviour from experience, from the
outside of this "black box".
2) How should UREs be visible? Via error reporting through dmesg?
If you want to see how the system responds when it hits a URE, you can use the
hdparm command and the "--make-bad-sector" option. There is also a
"--repair-sector" option which will (hopefully) repair the sector when you
are done.
Does this command behave the same as a real URE, i.e. will the drive keep
trying until its timeout (which is what, 90 seconds on a consumer drive, 7
seconds on an enterprise drive, right?)?
If it fails immediately then it's not testing the same thing as a "real"
URE. Might be good to know if one does testing that's supposed to emulate
real failures.
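
One rough way to check this is to time a single-sector read and see whether
the error comes back immediately or only after many seconds. The following
is only a sketch: the function name time_sector_read is made up for
illustration, and it assumes 512-byte logical sectors and that a URE
surfaces to userspace as an OSError (normally EIO).

```python
import os
import time

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def time_sector_read(path, sector, sector_size=SECTOR_SIZE):
    """Read one sector and return (seconds_elapsed, error_or_None).

    `path` can be a block device such as /dev/sdX (needs root) or any
    regular file, which is handy for trying the function out safely.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        try:
            os.pread(fd, sector_size, sector * sector_size)
            error = None
        except OSError as exc:
            # An unreadable sector is expected to show up here.
            error = exc
        return time.monotonic() - start, error
    finally:
        os.close(fd)
```

Calling time_sector_read("/dev/sdX", N) on the marked sector and comparing
the elapsed time with the drive's expected timeout would hint at whether the
drive retried internally. The measured time also includes any retries done
by the kernel, and a sector already in the page cache will be returned
without touching the drive at all, so treat the number as an indication only.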
--
Mikael Abrahamsson email: swmike@xxxxxxxxx