Re: Software RAID when it works and when it doesn't

On Tue, 2007-10-16 at 17:57 -0400, Mike Accetta wrote:

> Was the disk driver generating any low level errors or otherwise
> indicating that it might be retrying operations on the bad drive at
> the time (i.e. console diagnostics)?  As Neil mentioned later, the md layer
> is at the mercy of the low level disk driver.  We've observed abysmal
> RAID1 recovery times on failing SATA disks because all the time is
> being spent in the driver retrying operations which will never succeed.
> Also, read errors don't tend to fail the array so when the bad disk is
> again accessed for some subsequent read the whole hopeless retry process
> begins anew.

The console was full of errors like:

end_request: I/O error, dev sdb, sector 42644555

I don't know what generates those messages.
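To get a feel for how concentrated the failures are, the messages can be tallied per device. This is only a rough sketch: it assumes the exact message wording quoted above, and the function name `tally_io_errors` is made up for illustration. Other kernel versions may format the line differently.

```python
import re
from collections import Counter

# Matches the message format quoted above; other kernels may word it differently.
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")

def tally_io_errors(lines):
    """Return (error count per device, (lowest, highest) failing sector per device)."""
    counts = Counter()
    ranges = {}
    for line in lines:
        m = IO_ERROR.search(line)
        if not m:
            continue  # not an I/O error line
        dev, sector = m.group(1), int(m.group(2))
        counts[dev] += 1
        lo, hi = ranges.get(dev, (sector, sector))
        ranges[dev] = (min(lo, sector), max(hi, sector))
    return counts, ranges
```

Feeding it the saved console output (one line per list element) shows whether the errors cluster on one disk and in one sector range.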

As I asked before but never got an answer, is there a way to do timeouts
within the md code so that we are not at the mercy of the lower layer
drivers?
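To make the question concrete, here is a userspace illustration of the behaviour I have in mind. It is not md code and says nothing about how md would actually implement it; `read_with_deadline`, `readers` and `deadline_s` are hypothetical names. The caller stops waiting on a slow mirror and moves on to the next one.

```python
import concurrent.futures as cf

def read_with_deadline(readers, deadline_s):
    """Try each (name, read_fn) mirror in turn; skip one that misses the deadline."""
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(readers)))
    try:
        for name, read_fn in readers:
            future = pool.submit(read_fn)
            try:
                return name, future.result(timeout=deadline_s)
            except cf.TimeoutError:
                continue  # stop waiting; the slow read itself keeps running
            except OSError:
                continue  # the read failed outright, try the next mirror
        raise OSError("no mirror answered within the deadline")
    finally:
        pool.shutdown(wait=False)  # do not block on reads still in flight
```

Note that the slow read is only abandoned, not cancelled: the upper layer can decide to stop waiting, but the operation stuck in the lower layer still has to finish or fail on its own.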

> 
> I posted a patch about 6 weeks ago which attempts to improve this situation
> for RAID1 by telling the driver not to retry on failures and giving some
> weight to read errors for failing the array.  Hopefully, Neil is still
> mulling it over and it or something similar will eventually make it into
> the main line kernel as a solution for this problem.
> --
> Mike Accetta
> 

Thanks,

Alberto