Re: mdadm expanded 8 disk raid 6 fails in new server, 5 original devices show no md superblock

Wilson Jonathan <piercing_male@xxxxxxxxxxx> · Wed, 15 Jan 2014 12:50:19 +0000

On Tue, 2014-01-14 at 13:43 -0500, Phil Turmel wrote:
> On 01/14/2014 12:47 PM, Wilson Jonathan wrote:
> 
> [trim /]
> 
> > I understand the issue of "timeout" on drives that might perform long
> > error checking which then causes mdadm, via the device (block?) driver
> > issuing a time out, to then kick the drive. In this instance you allow
> > some time for a drive to try and fix things at the expense of a hung
> > array for a longer period of time.
> > 
> > I also understand that with scterc the drive gives up (in effect timing
> > itself out) when it hits the 7 second mark, or thereabouts, and
> > subsequently mdadm kicks the drive out. In this specific instance the
> > idea is to kill a drive quickly so that the raid doesn't hang longer
> > than a few seconds.
> 
> No.  The intent is to fail the read without failing the controller channel.

Arrr, thanks for the clarification... I hadn't realised that instead of
the drive returning an "Error, I can't get the data, I'm dead in the
water" message, it returned a "Warning, I can't get the data, you
deal with it and get back to me, I'm still working" kind of affair.

> 
> > However, surely these things (bar the amount of time) lead to the same
> > final result of a drive being kicked out. Even in a non-mdadm hardware
> > raid setup, the drive is either kicked because it didn't return in 7
> > seconds, or the drive kicks itself because it gave up before 7
> > seconds.
> 
> No.  Upon a failed read, MD will obtain/reconstruct the problem sector
> from remaining redundancy, then write the correct data back.  Occasional
> read errors of this type are normal, and fix themselves when the sector
> is written again.  MD will only fail a drive after *multiple* read
> errors, not just one.  (Isolated bursts of up to 20, then ~ ten per hour.)
> 

I see now... I had totally the wrong idea of what happened and how they
differed. 
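
For anyone reading this in the archive, here is a toy sketch of the
"reconstruct from redundancy, then write back" step Phil describes. It
is only an illustration, not how MD is implemented: it assumes single
XOR parity (raid 5 style, whereas raid 6 keeps two parity blocks), and
the names xor_blocks and repair_failed_read are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_failed_read(stripe, parity, failed_index):
    """Rebuild the unreadable block from the surviving blocks plus parity,
    then write it back into the stripe (hypothetical helper)."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    recovered = xor_blocks(survivors + [parity])
    stripe[failed_index] = recovered  # the write-back that clears the bad sector
    return recovered


# Three data blocks and their parity, as they would sit on four drives.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate an unreadable sector on the second drive.
stripe = list(data)
stripe[1] = None

recovered = repair_failed_read(stripe, parity, failed_index=1)
```

The point is that a single failed read costs nothing but a little
computation, provided the drive is still answering when the write-back
arrives.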

> [trim /]
> 
> > Surely, unless I'm missing something, rebuilding a failed drive's data
> > means that you want the system to not kick if at all possible, and having
> > scterc enabled or a short timeout (shorter than the drive's max time,
> > unless that time is indefinite retry) is the last thing you want?
> 
> What you are missing is what happens when the controller channel times
> out.  The original read is reported failed to MD while the driver tries
> to revive the unresponsive drive.  MD proceeds to obtain/reconstruct the
> missing data, then write back.  But the device is not communicating--the
> driver has reset the channel, and will continue not communicating until
> the drive firmware finally gives up on the original read.  So the
> *write* fails instantly, kicking the drive out of the array.
> 
> When you, the admin, get around to looking, the drive is idle but
> apparently fine.  (It gains a "pending" sector, which stays until the
> drive is told to write over that spot.)
> 
> HTH,

It does, thanks for the information :-)
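
To check my own understanding I wrote the two cases down as a small
model. It is a sketch under assumptions, not a description of the
kernel code: read_error_outcome is a made-up name, the 30 second
default is the usual Linux driver command timeout as far as I know, and
None stands for a drive with no error recovery limit set.

```python
from typing import Optional


def read_error_outcome(drive_recovery_s: Optional[float],
                       driver_timeout_s: float = 30.0) -> str:
    """Model what happens to a drive that hits an unreadable sector.

    drive_recovery_s: how long the drive retries before reporting the
        read error (about 7 with scterc enabled); None means no limit.
    driver_timeout_s: how long the driver waits before resetting the
        channel (assumed default of 30 seconds).
    """
    if drive_recovery_s is not None and drive_recovery_s < driver_timeout_s:
        # The drive reports the failed read while the channel is still alive,
        # so MD can reconstruct the sector and write it back.
        return "read error reported; sector rewritten; drive stays in array"
    # The driver gives up first and resets the channel. The drive is still
    # busy retrying, so MD's write-back fails and the drive is kicked.
    return "channel reset; write-back fails; drive kicked from array"


with_scterc = read_error_outcome(7.0)
without_limit = read_error_outcome(None)
long_driver_timeout = read_error_outcome(120.0, driver_timeout_s=180.0)
```

So either the drive has to give up before the driver does, or the
driver timeout has to be raised above the drive's worst case.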

> 
> Phil
> 

Jon
