Re: Implementing low level timeouts within MD

On Tue, 2007-10-30 at 13:39 -0400, Doug Ledford wrote:
> 
> Really, you've only been bitten by three so far.  Serverworks PATA
> (which I tend to agree with the other person, I would probably chalk

Three types of bugs is too many; they affected basically all my
customers with multi-terabyte arrays. Heck, we could also oversimplify
things and say that it is really just one type, defining everything as
a kernel problem (or, as some other kernel used to say... general
protection error).

I am sorry for not having hundreds of RAID servers from which to draw
statistical analysis. As I have clearly stated in the past, I am trying
to come up with a list of known combinations that work. I think my
data points are worth something to some people, especially those
considering SATA drives and software RAID for their file servers. If
you don't consider them important to you, that's fine, but please don't
belittle them just because they don't match your needs.

> this up to Serverworks, not PATA), USB storage, and SATA (the SATA stack
> is arranged similar to the SCSI stack with a core library that all the
> drivers use, and then hardware dependent driver modules...I suspect that
> since you got bit on three different hardware versions that you were in
> fact hitting a core library bug, but that's just a suspicion and I could
> well be wrong).  What you haven't tried is any of the SCSI/SAS/FC stuff,
> and generally that's what I've always used and had good things to say
> about.  I've only used SATA for my home systems or workstations, not any
> production servers.

The USB array was never meant to be a full production system, just to
buy some time until the budget was allocated to buy a real array. Having
said that, the RAID code is written to withstand the USB disks getting
disconnected, as long as the driver reports it properly. Since it
doesn't, I consider it another case showing when not to use software
RAID on the assumption that it will just work.

As for SCSI, I think it is a well-proven and reliable technology. I've
dealt with it extensively and have always had great results. I now deal
with it mostly on non-Linux systems. But I don't think it is
affordable to most SMBs that need multi-terabyte arrays.

> 
> > I'll repeat my plea one more time. Is there a published list
> > of tested combinations that respond well to hardware failures
> > and fully signals the md code so that nothing hangs?
> 
> I don't know of one, but like I said, I've not used a lot of the SATA
> stuff for production.  I would make this one suggestion though, SATA is
> still an evolving driver stack to a certain extent, and as such, keeping
> with more current kernels than you have been using is likely to be a big
> factor in whether or not these sorts of things happen.

OK, so based on this it seems that you would not recommend the use
of SATA for production systems due to its immaturity, correct? Keep in
mind that production systems cannot be brought down just to keep up
with kernel changes. We have some Tru64 production servers with
1500 to 2500 days of uptime; that's not uncommon in industry.

Alberto
