On Thu, Apr 27, 2006 at 08:57:51AM -0400, Ketema Harris wrote:
> OK. My thought process was that having non-local storage as say a big
> RAID 5 SAN (I am talking 5 TB with expansion capability up to 10)
That's two disk trays for a cheap slow array. (Versus a more expensive
solution with more spindles and better seek performance.)
> would allow me to have redundancy, expandability, and hopefully still
> retain decent performance from the db. I also would hopefully then not
> have to do periodic backups from the db server to some other type of
> storage.
No, backups are completely unrelated to your storage type; you need them
either way. On a SAN you can use a SAN backup solution to back multiple
systems up with a single backup unit without involving the host CPUs.
This is fairly useless if you aren't amortizing the cost over a large
environment.
> Is this not a good idea?
It really depends on what you're hoping to get. As described, it's not
clear. (I don't know what you mean by "redundancy, expandability" or
"decent performance".)
> How bad of a performance hit are we talking about?
Way too many factors for an easy answer. Consider the case of NAS vs
SCSI direct attach storage. You're probably in that case comparing a
single 125MB/s (peak) gigabit ethernet channel to (potentially several)
320MB/s (peak) SCSI channels. With a high-end NAS you might get 120MB/s
off that GBE. With a (more realistic) mid-range unit you're more likely
to get 40-60MB/s. Getting 200MB/s off the SCSI channel isn't a stretch,
and you can fairly easily stripe across multiple SCSI channels. (You can
also bond multiple GBEs, but then your cost & complexity start going way
up, and you're never going to scale as well.) If you have an environment
where you're doing a lot of sequential scans it isn't even a contest.
You can also substitute SATA for SCSI, etc.
For an FC SAN the performance numbers are a lot better, but the costs &
complexity are a lot higher. An iSCSI SAN is somewhere in the middle.
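To make those numbers concrete, here's a rough back-of-envelope sketch (in
Python) of how long a big sequential scan would take at each of the
throughput figures estimated above. The table size and the MB/s values are
illustrative assumptions pulled from the estimates in this mail, not
measurements of any particular hardware:

    # Rough sequential-scan time estimates for the throughput figures
    # discussed above. All numbers are illustrative assumptions, not
    # benchmarks of real hardware.

    table_gb = 100  # hypothetical table size to scan, in GB

    # (label, assumed sustained throughput in MB/s)
    interfaces = [
        ("mid-range NAS over single GbE", 50),
        ("high-end NAS over single GbE", 120),
        ("single SCSI U320 channel", 200),
        ("two striped SCSI channels", 400),
    ]

    for label, mb_per_s in interfaces:
        seconds = table_gb * 1024 / mb_per_s
        print(f"{label:35s} ~{seconds / 60:5.1f} minutes "
              f"for a {table_gb} GB scan")

The point of the arithmetic is just that at sequential-scan workloads the
gap between a single GbE pipe and a couple of striped SCSI channels is
measured in multiples, not percent.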
> Also, in regards to the commit data integrity, as far as the db is
> concerned once the data is sent to the SAN or NAS isn't it "written"?
> The storage may have that write in cache, but from my reading and
> understanding of how these various storage devices work that is how
> they keep up performance.
Depends on the configuration, but yes, most should be able to report
back a "write" once the data is in a non-volatile cache. You can do the
same with a direct-attached array and eliminate the latency inherent in
accessing the remote storage.
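As a side note on what "written" means to the database: the db treats a
commit as durable once fsync() returns, so durability ultimately rests on
whatever layer answers that fsync, whether that's a battery-backed
(non-volatile) cache or the platter itself. A minimal sketch of the
sequence, with a made-up filename just for illustration:

    import os

    # Minimal sketch of the write/fsync sequence a database goes through
    # at commit time. The storage stack below fsync() decides when to
    # acknowledge: a non-volatile cache can answer immediately, while a
    # volatile cache that acknowledges early is what puts committed data
    # at risk on power loss.

    WAL_PATH = "wal_segment.demo"  # hypothetical file, illustration only

    fd = os.open(WAL_PATH, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"commit record goes here\n")  # still in OS cache
        os.fsync(fd)  # returns once the device (or its NV cache) acks
    finally:
        os.close(fd)
    print("commit considered durable once fsync() returned")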
> I would expect my bottleneck if any to be the actual Ethernet transfer
> to the storage, and I am going to try and compensate for that with a
> full gigabit backbone.
See above.
The advantages of a NAS or SAN are in things you haven't really touched
on. Is the filesystem going to be accessed by several systems? Do you
need the ability to do snapshots? (You may be able to do this with
direct-attach also, but doing it on a remote storage device tends to be
simpler.) Do you want to share one big, expensive, reliable unit between
multiple systems? Will you be doing failover? (Note that failover
requires software to go with the hardware, and can be done in a
different way with local storage also.) In some environments the answers
to those questions are yes, and the price premium & performance
implications are well worth it. For a single DB server the answer is
almost certainly "no".
Mike Stone