On 17/12/2009 5:02 AM, Gauthier, Dave wrote:
Hi Everyone:
Tomorrow, I will need to present to a group of managers (who know
nothing about DBs) why I chose to use PG over MySQL in a project, MySQL
being the more popular DB choice with other engineers, and managers
fearing things that are “different” (risk). I have a few hard tecnical
reasons (check constraint, deferred constraint checking, array data
type), but I’m looking for a “it’s more reliable” reasons. Again, the
audience is managers. Is there an impartial, 3^rd party evaluation of
the 2 DBs out there that identifies PG as being more reliable? It might
mention things like fewer incidences of corrupt tables/indexes, fewer
deamon crashes, better recovery after system crashes, etc... ?
In all honesty, I don't know if there's much out there in terms of
impartial analysis. Most of it is done by someone with some sort of a
preference that tends to make its self known.
It also depends a _lot_ on what you are doing with the database. What
sorts of data are you storing? How important to you is that data? What
sorts of client workloads do you expect - huge numbers of clients
running frequent simple queries, or fewer clients with big complex
queries? How much data do you expect to store? etc. All these have a
real bearing on database choice, and it's hard to give good answers
without some knowledge of those details.
One thing I'd like to highlight now: when people say "MySQL is faster"
or "Pg is slow" they tend to (a) be referring to very old versions of
Pg, and (b) be using the very fast but very unsafe MyISAM table type in
MySQL, which is great until it eats your data. So beware of speed claims
not backed by very solid configuration details.
Anyway, just to be different let's try to look at why you might choose
MySQL over PostgreSQL, instead of getting all us Pg folks listing why
you should pick Pg. To me, Pg is the default safe and sane choice, and I
need to seek reasons why I might use MySQL instead for a particular
task. So:
*scratches head*
- MySQL is horizontally scalable via clustering and multi-master
replication (though you must beware of numerous gotchas). PostgreSQL can
be used with read-only slaves via Slony/Bucardo/etc replication, but is
limited to a single authoriative master.
(There's work ongoing to enable readonly hot standby slaves with
failover, but no multi-master is on the horizion).
- If you don't care about your data, MySQL used with MyISAM is *crazy*
fast for lots of small simple queries. Big enough apps will still need
something like memcached on top of that, though. If using MySQL+MyISAM
this way you must be prepared to deal with table corruption on
crashes/outages/powerloss, lack of any transactional behaviour, etc.
There's also some bizarre error "handling" they use to avoid aborting a
non-transactional operation on a MyISAM table half-way though, so you
must be very careful to make sure your updates are valid before
attempting them. But.... why not just use memcached over something
somewhat slower but a lot safer? I guess this one isn't a plus.
- It's a cool tool when you want to query and integrate data from all
sorts of disparate sources, thanks to its support for pluggable storage
engines. If you want something for data analysis and integration rather
than safe storage it's well worth looking at.
--
Craig Ringer
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general