At 11:13 PM 4/7/2007, david@xxxxxxx wrote:
On Sat, 7 Apr 2007, Ron wrote:
Ron, I think that many people aren't saying cheap==good; what we are
doing is arguing against the idea that expensive==good (and its
corollary, cheap==bad).
Since the buying decision is binary (you either buy high-quality HDs
or you don't), the distinction between the two statements makes no
difference ITRW and is therefore meaningless. "The difference that
makes no difference =is= no difference."
The bottom line here is that no matter how it is "spun", people are
using the Google and CMU studies to justify reducing the quality of
the HDs they buy in order to reduce costs.
Frankly, they would be better advised to directly attack price
gouging by certain large vendors instead; but that is perceived as a
harder problem. So instead they are considering what is essentially
an example of Programming by Side Effect.
Every SW professional on this list has been taught how bad a strategy
that usually is.
My biggest concern is that something I've seen over and over again
in my career will happen again:
People tend to jump at the _slightest_ excuse to believe a story
that will save them short-term money and resist even _strong_
reasons to pay up front for quality, even if paying more up front
would lower their lifetime TCO.
on the other hand, it's easy for people to blow $bigbucks with this
argument with no significant reduction in their maintenance costs.
No argument there. My comments on people putting up with price
gouging should make clear my position on overspending.
The Google and CMU studies are =not= based on data drawn from
businesses where the lesser consequences of an outage are losing
$10Ks or $100Ks per minute ...and where the greater consequences
include the chance of loss of human life.
Nor are they based on businesses that must rely exclusively on
highly skilled and therefore expensive labor.
hmm, I didn't see the CMU study document what businesses it used.
Section 2.3: Data Sources, p3-4.
3 HPC clusters, each described as "The applications running on this
system are typically large-scale scientific simulations or
visualization applications."
3 ISPs, 1 HW failure log, 1 warranty service log of hardware
failures, and 1 exclusively FC HD set based on 4 different kinds of FC HDs.
In the case of the CMU study, people are even extrapolating an
economic conclusion the original author did not make or intend!
Is it any wonder I'm expressing concern regarding inappropriate
extrapolation of those studies?
I missed the posts where people were extrapolating economic
conclusions; what I saw was people stating that 'you better buy the
SCSI drives as they are more reliable', and other people pointing
out that recent studies indicate that there's not a significant
difference in drive reliability between the two types of drives.
The original poster asked a simple question regarding 8 SCSI HDs vs
24 SATA HDs. That question was answered definitively some posts ago
(use 24 SATA HDs).
Once this thread started talking about the Google and CMU studies, it
expanded beyond the OP's original SCSI vs SATA question.
(else why are we including FC and other issues in our considerations
as in the CMU study?)
We seem to have evolved to
"Does paying more for enterprise class HDs vs consumer class HDs
result in enough of a quality difference to be worth it?"
To analyze that question, the only two HD metrics that should be
considered are
1= whether the vendor rates the HD as "enterprise" or not, and
2= the length of the warranty on the HD in question.
Otherwise, one risks clouding the analysis due to the costs of the
interface used.
(there are plenty of non HD metrics that need to be considered to
examine the issue properly.)
The CMU study was not examining any economic issue, and therefore to
draw an economic conclusion from it is questionable.
The CMU study was about whether the industry standard failure model
matched empirical historical evidence.
Using the CMU study for any other purpose risks misjudgment.
Let's pretend =You= get to build Citibank's or Humana's next
mission-critical production DBMS using exclusively HDs with 1-year warranties.
(never would be allowed ITRW)
who is arguing that you should use drives with 1 year warranties? in
case you blinked, consumer drive warranties are back up to 5 years.
As Josh Drake has since posted, they are not (although TBF most seem
to be greater than 1 year at this point).
So can I safely assume that we have agreement that you would not
advise using HDs with less than 5 year warranties for any DBMS?
If so, the only debate point left is whether there is a meaningful
distinction between HDs rated as "enterprise class" vs others by the
same vendor within the same generation.
Even if you RAID 6 them, I'll bet you anything that a system with
32+ HDs on it will spend a high enough percentage of its time
operating in degraded mode that you are likely to be looking for a
job as a consequence of such a decision.
...and if you actually suffer data loss or, worse, data corruption,
that's a Career Killing Move.
(and it should be, given the likely consequences to the public of
such a F* up).
so now it's "nobody got fired for buying SCSI?"
Again, we are way past simple SCSI vs SATA interface issues and well
into more fundamental issues of HD quality and price.
Let's bear in mind that SCSI is =a legacy technology=. Seagate will
cease making all SCSI HDs in 2007. The SCSI standard has been
stagnant and obsolescent for years. Frankly, the failure of the FC
vendors to come out with 10Gb FC in a timely fashion has probably
killed that interface as well.
The future is most likely SATA vs SAS. =Those= are most likely the
relevant long-term technologies in this discussion.
frankly, I think that a lot of the cost comes from the simple fact
that they use smaller SCSI drives (most of them haven't started
using 300G drives yet), and so they end up needing ~5x more drive
bays, power, cooling, cabling, ports on the controllers, etc. If
you need 5x the number of drives and they each cost 3x as much, you
are already up to a 15x price multiplier; going from there to 50x is
only adding another ~3x multiplier (which, with the extra complexity
of everything, is easy to see, and almost seems reasonable).
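Restating that arithmetic as a quick sketch (the ~5x drive count and
~3x unit price are the assumptions from the paragraph above, and 50x
is the overall system price multiplier being discussed, not measured
prices):

    # Quick restatement of the price-multiplier argument above.
    # The ratios are the paragraph's assumptions, not measured prices.

    drive_count_ratio = 5      # ~5x more small SCSI drives to match SATA capacity
    unit_price_ratio = 3       # each SCSI drive assumed ~3x the price of a SATA one
    overall_multiplier = 50    # overall system price multiplier under discussion

    drives_only = drive_count_ratio * unit_price_ratio   # 15x from drives alone
    infrastructure = overall_multiplier / drives_only    # remaining multiplier

    print(f"drives alone account for a {drives_only}x multiplier")
    print(f"bays/power/cooling/cabling/ports absorb the remaining "
          f"~{infrastructure:.1f}x")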
Well, be prepared to re-examine this issue when you have to consider
using 2.5" 73GB SAS HDs vs using 3.5" >= 500GB SATA HDs.
For OLTP-like workloads, there is a high likelihood that solutions
involving more spindles are going to be better than those involving
fewer spindles.
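As a toy illustration of the spindle-count point applied to the OP's
8-SCSI vs 24-SATA question (the per-spindle random-IOPS figures below
are rough assumptions, not benchmark results):

    # Toy comparison of aggregate random-I/O capacity by spindle count.
    # Per-spindle IOPS figures are rough assumptions, not benchmark results.

    scsi_spindles, scsi_iops = 8, 180    # assumed ~15k RPM SCSI/SAS drive
    sata_spindles, sata_iops = 24, 80    # assumed ~7200 RPM SATA drive

    print(f" 8 SCSI spindles: ~{scsi_spindles * scsi_iops} random IOPS aggregate")
    print(f"24 SATA spindles: ~{sata_spindles * sata_iops} random IOPS aggregate")

Even with the slower per-spindle figure, the larger SATA array comes
out ahead on aggregate random I/O in this sketch, which is consistent
with the answer given to the OP earlier in the thread.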
Reliability isn't the only metric of consideration here. If
organizations have to go certain routes to meet their business goals,
their choices are legitimately constrained.
(I recall being asked for a 10TB OLAP system 7 years ago and telling
the client that, at that point in time, the only DBMS products that
could be trusted with that task were DB2 and Oracle: an answer the
M$-favoring CEO of the client did !not! like.)
if I had the money to waste, I would love to see someone open the
'consumer grade' seagate Barracuda 7200.10 750G drive along with an
'enterprise grade' seagate Barracuda ES 750G drive (both of which
have 5 year warranties) to see if there is still the same 'dramatic
difference' between consumer and enterprise drives that there used to be.
it would also be interesting to compare the high-end scsi drives
with the recent SATA/IDE drives. I'll have to look and see if I can
catch some dead drives before they get destroyed and open them up.
I have to admit I haven't done this experiment in a few years
either. When I did, there always was a notable difference (in
keeping with the vendors' claims).
This thread is not about whether there is a difference worthy of note
between anecdotal opinion, professional advice, and the results of studies.
I've made my POV clear on that topic and if there is to be a more
thorough analysis or discussion of it, it properly belongs in another thread.
Cheers,
Ron Peacetree