Re: Is there a drive error "retry" parameter?

Paul Clements <paul.clements@xxxxxxxxxxxx> · Wed, 15 Jun 2005 20:20:42 -0400

Michael Tokarev wrote:
Carlos Knowlton wrote:

Is there a "retry" parameter that can be set in the kernel parameters,
or else in the code itself to prolong the existence of a drive in an
array before it is considered dirty?

There's no such parameter currently.  But there were several discussions
about how to make the raid code more robust - in particular, in case of a
read error, the raid code could keep the errored drive in the array and
mark it dirty only in case of a write error.
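
A rough sketch of the policy described above, for illustration only. This is not md code: the Drive and Mirror classes and their methods are hypothetical names, and the model assumes a simple two-way mirror where a failed read can be served from the other copy.

```python
# Hypothetical model of the proposed policy; not taken from the md driver.
class Drive:
    def __init__(self, name, blocks, bad_reads=(), bad_writes=()):
        self.name = name
        self.blocks = list(blocks)
        self.bad_reads = set(bad_reads)    # block numbers that fail on read
        self.bad_writes = set(bad_writes)  # block numbers that fail on write
        self.faulty = False                # True once kicked out of the array

    def read(self, n):
        if n in self.bad_reads:
            raise IOError(f"{self.name}: read error at block {n}")
        return self.blocks[n]

    def write(self, n, data):
        if n in self.bad_writes:
            raise IOError(f"{self.name}: write error at block {n}")
        self.blocks[n] = data


class Mirror:
    def __init__(self, drives):
        self.drives = drives

    def read_block(self, n):
        # A read error does NOT fail the drive; just try the next copy.
        for d in self.drives:
            if d.faulty:
                continue
            try:
                return d.read(n)
            except IOError:
                continue
        raise IOError(f"no readable copy of block {n}")

    def write_block(self, n, data):
        # A write error is what marks the drive as failed.
        for d in self.drives:
            if d.faulty:
                continue
            try:
                d.write(n, data)
            except IOError:
                d.faulty = True
```

With this model, a drive that returns a read error stays in the array and keeps receiving writes; only a failed write removes it.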

That would be nice.  Do you know if anyone has done any work toward 
such a fix?

Looks like this is a "FAQ #1" candidate for linux softraid ;)
I tried to do just that myself, with help from Peter T. Breuer.
The code even worked here on a test machine for some time.
But it's umm.. quite a bit ugly, and Neil is going in a slightly
different direction (which I for one don't like much - the
persistent bitmaps stuff - I think a simpler approach is better).

The persistent bitmap code has got nothing to do with read/write error 
correction. The bitmap simply keeps track of what's out of sync between 
the component drives, so you never need a full resync. On the other 
hand, read/write error correction tries to limit the conditions under 
which a drive would be kicked out of an array (thus resulting in a 
resync). Ultimately, I think we'd like to see both capabilities in md, 
though...
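
To make the distinction concrete, here is a minimal sketch of the bitmap idea only. It is a simplified model, not the md implementation: BitmapMirror, CHUNK, and the method names are hypothetical, the bitmap lives in memory rather than on disk, and chunks are recorded as dirty only while the second drive is offline.

```python
# Hypothetical sketch of an out-of-sync bitmap; not the md implementation.
CHUNK = 4  # blocks covered by one bitmap bit (assumed value for illustration)


class BitmapMirror:
    def __init__(self, nblocks):
        self.primary = [0] * nblocks
        self.secondary = [0] * nblocks
        self.secondary_online = True
        self.dirty = set()  # chunk indexes that may differ between the drives

    def write(self, n, value):
        self.primary[n] = value
        if self.secondary_online:
            self.secondary[n] = value
        else:
            # Remember which chunk is now out of sync on the missing drive.
            self.dirty.add(n // CHUNK)

    def resync(self):
        """Bring the secondary back, copying only dirty chunks.

        Returns the number of blocks copied.
        """
        copied = 0
        for chunk in sorted(self.dirty):
            start = chunk * CHUNK
            end = min(start + CHUNK, len(self.primary))
            self.secondary[start:end] = self.primary[start:end]
            copied += end - start
        self.dirty.clear()
        self.secondary_online = True
        return copied
```

In this model, two writes that land in the same chunk while the second drive is away cost one chunk of copying on its return, rather than a pass over the whole device. Nothing here decides when a drive gets kicked out; that is the separate error-handling question.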

--
Paul
