On May 9, 2006, at 2:16 AM, Hannes Dorbath wrote:
Hi,
I've just had some discussion with colleagues regarding the usage
of hardware or software raid 1/10 for our linux based database
servers.
I myself can't see much reason to spend $500 on high end controller
cards for a simple Raid 1.
Any arguments pro or contra would be desirable.
From my experience and what I've read here:
+ Hardware Raids might be a bit easier to manage, if you never
spend a few hours to learn Software Raid Tools.
+ There are situations in which Software Raids are faster, as CPU
power has advanced dramatically in the last years and even high end
controller cards cannot keep up with that.
+ Using SATA drives is always a bit of risk, as some drives are
lying about whether they are caching or not.
Don't buy those drives. That's unrelated to whether you use hardware
or software RAID.
+ Using hardware controllers, the array becomes locked to a
particular vendor. You can't switch controller vendors as the array
meta information is stored proprietary. In case the Raid is broken
to a level the controller can't recover automatically this might
complicate manual recovery by specialists.
Yes. Fortunately we're using the RAID for database work, rather than
file storage, so we can use all the nice PostgreSQL features for
backing up and replicating the data elsewhere, which avoids most of
this issue.
+ Even battery backed controllers can't guarantee that data written
to the drives is consistent after a power outage, neither that the
drive does not corrupt something during the involuntary shutdown /
power irregularities. (This is theoretical as any server will be
UPS backed)
That's what fsync of the WAL is for.
If you have a battery-backed write-back cache then you can get the
reliability of fsyncing the WAL for every transaction, and the
performance of not needing to hit the disk for every transaction.
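
To get a feel for what an fsync per transaction costs on a given box,
here is a minimal Python sketch. It is not PostgreSQL's own code; the
function name, record size and counts are assumptions made up for
illustration. It just times small appends, each followed by os.fsync():

```python
import os
import tempfile
import time


def time_fsync_appends(directory, count=200, record_size=512):
    """Append `count` small records, calling fsync after each one.

    Returns the mean seconds per write+fsync. This only mimics the
    pattern of a commit-time WAL flush; it is not how PostgreSQL
    itself writes its WAL.
    """
    record = b"x" * record_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, record)
            os.fsync(fd)  # ask the OS to flush this file to the storage device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed / count


if __name__ == "__main__":
    # Point this at a directory on the array you want to measure.
    mean = time_fsync_appends(tempfile.gettempdir())
    print("mean write+fsync: %.3f ms" % (mean * 1000))
```

Run it against a directory on the array in question. With a
battery-backed write-back cache the per-fsync time should come out far
lower than on drives that really wait for the platter, though the exact
numbers will depend on the hardware and filesystem.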
Also, if you're not doing that you'll need to dedicate a pair of
spindles to the WAL if you want to get good performance, so that
there'll be no seeking on the WAL. With a write-back cache you can put
the WAL on the same spindles as the database and not lose much, if
anything, in the way of performance.
If that saves you the cost of two additional spindles, and the space
on your drive shelf for them, you've just paid for a reasonably priced
RAID controller.
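
The break-even argument above can be sketched as a one-line check. The
function name is hypothetical, the $500 controller figure comes from
the original question, and the drive prices are placeholders rather
than real quotes; drive shelf space is ignored:

```python
def controller_pays_for_itself(controller_cost, drive_cost, spindles_saved=2):
    """True if the drives you no longer need cost at least as much as the controller."""
    return drive_cost * spindles_saved >= controller_cost


# $500 is the controller price from the question; 300 and 150 are
# placeholder per-drive prices, not real quotes.
print(controller_pays_for_itself(500, 300))  # True:  2 x 300 >= 500
print(controller_pays_for_itself(500, 150))  # False: 2 x 150 <  500
```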
Given those advantages... I can't imagine speccing a large system that
didn't have a battery-backed write-back cache in it. My dev systems
mostly use software RAID, if they use RAID at all. But my production
boxes all use SATA RAID (and I tell my customers to use controllers
with BB cache, whether it be SCSI or SATA).
My usual workloads are write-heavy. If yours are read-heavy that will
move the sweet spot around significantly, and I can easily imagine that
for a read-heavy load software RAID might be a much better match.
Cheers,
Steve