RE: upgrade advice

> -----Original Message-----
> From: Jon Nelson [mailto:jnelson-linux-raid@xxxxxxxxxxx]
> Sent: Wednesday, December 17, 2008 8:38 AM
> To: David Lethe
> Cc: Redeeman; Justin Piszcz; linux-raid@xxxxxxxxxxxxxxx
> Subject: Re: upgrade advice
> 
> On Wed, Dec 17, 2008 at 8:28 AM, David Lethe <david@xxxxxxxxxxxx>
> wrote:
> >> -----Original Message-----
> >> From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-
> >> owner@xxxxxxxxxxxxxxx] On Behalf Of Jon Nelson
> >> Sent: Wednesday, December 17, 2008 7:27 AM
> >> To: Redeeman
> >> Cc: Justin Piszcz; linux-raid@xxxxxxxxxxxxxxx
> >> Subject: Re: upgrade advice
> >>
> >> > What sort of volume of disks are you using, and what loads?
> >> > (24/7 with high load?)
> >>
> >> It's a home server. It's up 24/7. Load probably 80% of the time is
> >> low, the rest of the time it's bursty.
> >>
> 
> > Read the specs on the disks.  Most consumer class drives are rated for
> > only 2400 hours annual duty cycle ... so I guess you turn the computer
> > off in April? :)
> 
> Yeah. Right. I typically get > 5 years out of the disks. I got almost
> 9 years once, before bad sectors started showing up. Actually, I've
> either gotten less than a week or more than 4 years out of every
> single drive I've ever had, except a really bad batch of seagates I
> got 8-10 years ago.
> 
> The current temp of the drives varies between 28C and 33C (the Hitachi
> is warmer by +4C than any other drive).
> 
> > Other differences include number of ECC correction bits, so you will
> > absolutely get more grown bad blocks with cheap drives.
> 
> That's good to know.
> 
> > drives.   No wonder several fail within days of each other, they all
> > have same model, I/O load, and generally same manufacturing batch.
> 
> I never have more than 1 of the same manuf. / model in a raid at a
> time. I have a 3 drive raid10f2 with 3 different manuf.
> 
> > If you are hell-bent on getting cheap drives, then at least factor in
> > the cost of an additional drive so you can implement RAID6, and automate
> > a daemon to check/repair consistency often.
> 
> I will likely move to raid6 eventually.
> Thanks for the advice.
> 
> --
> Jon
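
Since you say you'll likely move to raid6 eventually: when you get around
to automating the consistency checks, a minimal sketch of the kind of
cron-driven check I mean (assuming the usual md sysfs knobs under
/sys/block/md0/md; adjust the array name and scheduling to taste, this is
not production code) is:

#!/usr/bin/env python
# Hypothetical sketch: start an md "check" pass and report the mismatch
# count when it finishes.  Assumes the array is md0 and the kernel exposes
# the usual sysfs files; run as root, e.g. weekly from cron.
import time

MD = "/sys/block/md0/md"

def read(name):
    return open(MD + "/" + name).read().strip()

def write(name, value):
    f = open(MD + "/" + name, "w")
    f.write(value)
    f.close()

write("sync_action", "check")            # kick off a background scrub
while read("sync_action") != "idle":     # poll until the pass completes
    time.sleep(60)
print("mismatch_cnt after check: " + read("mismatch_cnt"))

Schedule it regularly and complain loudly if mismatch_cnt comes back
non-zero.
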
The duty cycle makes a difference now, but it wasn't a design point until
the last few years, when hardware RAID manufacturers reluctantly started
supporting SATA disks. When you were buying disks 5-10 years ago, the
enterprise-class disks were SCSI & FC and the consumer drives were
ATA/SATA. The drive makers had to charge a little more and build the ATA
disks a little better to keep from losing too much money on warranty
replacements.  The market now demands clear pricing and
performance/reliability differences between enterprise- and consumer-class
SATA devices.

As for drive temperature ... believe it or not, it is largely irrelevant
to drive failures unless you are pushing the operational temperature
thresholds.  Google published a detailed analysis of drive failures in
their storage farm that included average drive temperatures, and it showed
that higher drive temperature did not increase the failure rate.  (In
fact, the drives running at the lowest temperatures actually had a
slightly higher failure rate.)
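
That said, if you want to keep an eye on the temperatures anyway, a
throwaway sketch along these lines works, assuming smartmontools is
installed, the drives report SMART attribute 194, and you adjust the
example device list:

#!/usr/bin/env python
# Hypothetical sketch: log drive temperatures from SMART attribute 194
# (Temperature_Celsius).  Assumes smartmontools; device list is an example.
import subprocess

for dev in ["/dev/sda", "/dev/sdb", "/dev/sdc"]:
    out = subprocess.Popen(["smartctl", "-A", dev],
                           stdout=subprocess.PIPE).communicate()[0]
    for line in out.decode("ascii", "replace").splitlines():
        fields = line.split()
        if fields and fields[0] == "194":
            # column 10 of "smartctl -A" output is the raw value (deg C)
            print("%s: %s C" % (dev, fields[9]))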

Anybody who thinks there is no difference between the specs hasn't looked
at them deeply enough.  ECC bits, background media scanning/repair
algorithms, and firmware make all the difference in the world.  Ask
anybody who has worked as a storage architect for a RAID manufacturer or
as a drive test engineer.

To the untrained eye (or to somebody who has never had the opportunity to
attend non-disclosure meetings with drive manufacturers), there isn't much
of a difference, because people get hung up on the easy things like RPM
and the amount of cache ... the stuff they put on the outside of the box.

Error detection, recovery algorithms, extensive log page reporting, and
online/offline firmware diagnostics tend to be ignored. Not only do
people not understand them, but some of the really good stuff isn't
published in the manuals due to intellectual property concerns.  If you
are in the 'biz, and buying thousands or tens of thousands of disks a
month, you become well aware of these things.
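
If you want to see what your drives will admit to, a quick sketch along
these lines (again assuming smartmontools; the default device path is only
an example) pulls the health verdict, error log, and self-test log, and
can kick off a long offline self-test:

#!/usr/bin/env python
# Hypothetical sketch: dump a drive's SMART health, error log, and
# self-test log.  Assumes smartmontools is installed.
import subprocess, sys

dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/sda"

subprocess.call(["smartctl", "-H", dev])              # overall health verdict
subprocess.call(["smartctl", "-l", "error", dev])     # SMART error log
subprocess.call(["smartctl", "-l", "selftest", dev])  # self-test history
# Uncomment to start a long (offline) self-test; the result shows up later
# in the self-test log above.
# subprocess.call(["smartctl", "-t", "long", dev])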

"Well, I buy xyz brand disk drives because I had such-and-such
experiences with abc brand disks".   How many times have people said
that??  The same person wouldn't commoditize car, dishwashers, or wine
like this.  People who don't understand a product say such things.   

I should touch on firmware as well.  NCQ, TCQ, severity-1 bugs that can
result in disks locking up or suffering catastrophic data loss?   They
exist.  I won't break any NDAs, but even a firmware upgrade can make a
profound difference in your storage farm.   The firmware update release
notes for many drives would scare the heck out of you, and make you wonder
what motivated ABC company to release the thing in the first place, or,
for that matter, what motivated ABC company not to go public and issue a
massive recall when certain bugs were found.   We've seen this happen in
the industry before with things like Seagate's stiction problem, NCQ bugs,
the IBM "Deathstars", Hitachi recalls, and so on.
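
At the very least, know what firmware you are actually running.  A sketch
of a firmware/model inventory across the array members (smartmontools
again; the device list is only an example):

#!/usr/bin/env python
# Hypothetical sketch: inventory model and firmware revision across a set
# of drives so that mismatches stand out.  Assumes smartmontools.
import subprocess

for dev in ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]:
    out = subprocess.Popen(["smartctl", "-i", dev],
                           stdout=subprocess.PIPE).communicate()[0]
    info = {}
    for line in out.decode("ascii", "replace").splitlines():
        if ":" in line:
            key, _, val = line.partition(":")
            info[key.strip()] = val.strip()
    print("%-10s  model=%-20s  firmware=%s" % (
        dev, info.get("Device Model", "?"), info.get("Firmware Version", "?")))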

OK, off my soapbox ... back to work writing disk diagnostic software for
my OEM customers.

David


