In summary, it seems one of these is true:

	1.  Drive manufacturers don't design server drives to be more
	    reliable than consumer drives.

	2.  Drive manufacturers _do_ design server drives to be more
	    reliable than consumer drives, but the design doesn't yield
	    significantly better reliability.

	3.  Server drives are significantly more reliable than consumer
	    drives.

---------------------------------------------------------------------------

Scott Marlowe wrote:
> On Thu, 2007-04-05 at 23:37, Greg Smith wrote:
> > On Thu, 5 Apr 2007, Scott Marlowe wrote:
> >
> > > On Thu, 2007-04-05 at 14:30, James Mansion wrote:
> > >> Can you cite any statistical evidence for this?
> > > Logic?
> >
> > OK, everyone who hasn't already needs to read the Google and CMU papers.
> > I'll even provide links for you:
> >
> > http://www.cs.cmu.edu/~bianca/fast07.pdf
> > http://labs.google.com/papers/disk_failures.pdf
> >
> > There are several things their data suggests that are completely at odds
> > with the lore suggested by traditional logic-based thinking in this area.
> > Section 3.4 of Google's paper basically disproves that "mechanical devices
> > have decreasing MTBF when run in hotter environments" applies to hard
> > drives in the normal range they're operated in.
>
> On the Google study:
>
> The Google study ONLY looked at consumer grade drives.  It did not
> compare them to server class drives.
>
> This is only true when the temperature is fairly low.  Note that the
> drive temperatures in the Google study are <=55C.  If the drive temp is
> below 55C, then the environment, by extension, must be lower than that
> by some fair bit, likely 10-15C, since the drive is a heat source, and
> the environment the heat sink.  So, the environment here is likely in
> the 35C range.
>
> Most server drives are rated for 55-60C environmental temperature
> operation, which means the drive would be even hotter.
>
> As for the CMU study:
>
> It didn't expressly compare server to consumer grade hard drives.
> Remember, there are server class SATA drives, and there were (once upon
> a time) consumer class SCSI drives.  If they had separated out the
> drives by server / consumer grade I think the study would have been more
> interesting.  But we just don't know from that study.
>
> Personal Experience:
>
> In my last job we had three very large storage arrays (big black
> refrigerator looking boxes, you know the kind.)  Each one had somewhere
> in the range of 150 or so drives in it.  The first two we purchased were
> based on 9Gig server class SCSI drives.  The third, and newer one, was
> based on commodity IDE drives.  I'm not sure of the size, but I believe
> they were somewhere around 20Gigs or so.  So, this was 5 or so years
> ago, not recently.
>
> We had a cooling failure in our hosting center, and the internal
> temperature of the data center rose to about 110F to 120F (43C to 48C).
> We ran at that temperature for about 12 hours, before we got a
> refrigerator on a flatbed brought in (btw, I highly recommend Aggreko if
> you need large scale portable air conditioners or generators) to cool
> things down.
>
> In the months that followed the drives in the IDE based storage array
> failed by the dozens.  We eventually replaced ALL the drives in that
> storage array because of the failure rate.  The SCSI based arrays had a
> few more drives fail than usual, but nothing too shocking.
>
> Now, maybe Seagate et al. are now making their consumer grade drives
> from yesterday's server grade technology, but 5 or 6 years ago that was
> not the case from what I saw.
>
> > Your comments about
> > server hard drives being rated to higher temperatures are helpful, but
> > conclusions drawn from just thinking about something I don't trust when
> > they conflict with statistics to the contrary.
>
> Actually, as I looked up some more data on this, I found it interesting
> that 5 to 10 years ago, consumer grade drives were rated for 35C
> environments, while today consumer grade drives seem to be rated to 55C
> or 60C.  Same as server drives were 5 to 10 years ago.  I do think that
> server grade drive tech has been migrating into the consumer realm over
> time.  I can imagine that today's high performance game / home systems
> with their heat generating video cards and tendency towards RAID1 /
> RAID0 drive setups are pushing the drive manufacturers to improve
> reliability of consumer disk drives.
>
> > The main thing I wish they'd published is breaking some of the statistics
> > down by drive manufacturer.  For example, they suggest a significant
> > number of drive failures were not predicted by SMART.  I've seen plenty of
> > drives where the SMART reporting was spotty at best (yes, I'm talking
> > about you, Maxtor) and wouldn't be surprised that they were quiet right up
> > to their bitter (and frequent) end.  I'm not sure how that factor may have
> > skewed this particular bit of data.
>
> I too have pretty much given up on Maxtor drives and things like SMART
> or sleep mode, or just plain working properly.
>
> In recent months, we had an AC unit fail here at work, and we have two
> drive manufacturers for our servers, manufacturers F and S.  The drives
> from F failed at a much higher rate, and developed lots and lots of bad
> sectors; the drives from manufacturer S, OTOH, have not had an increased
> failure rate.  While both manufacturers claim that their drives can
> survive in an environment of 55/60C, I'm pretty sure one of them was
> lying.  We are silently replacing the failed drives with drives from
> manufacturer S.
>
> Based on experience I think that on average server drives are more
> reliable than consumer grade drives, and can take more punishment.
> But, the variables of manufacturer, model, and the batch often make
> even more difference than grade.

--
  Bruce Momjian  <bruce@xxxxxxxxxx>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +