Re: Notes on converting from MySQL 5.0.x to PostgreSQL

Jan Wieck <JanWieck@xxxxxxxxx> · Mon, 10 Jul 2006 19:41:09 -0400

On 6/30/2006 11:12 AM, Scott Marlowe wrote:
I agree with Tom, nice notes.  I noted a few minor issues that seem to
derive from a familiarity with MySQL.  I'll put my corrections below...

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:
On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
--------------------------------------------------
Major differences I have noted:
-------------------------------

MySQL 5.0.x:

* Easy, built-in and extensive replication support.

Not sure how extensive it is.  It's basically synchronous single master
single slave, right?  It is quite easy though.

Last thing I heard was that MySQL still had only statement based 
replication and that it doesn't work together with some of the new 
enterprise features like triggers and stored procedures. Row level 
replication is on their TODO list and this major feature will probably 
appear in some minor 5.2.x release.

Jan

PostgreSQL 8.1.x:
* Embedded procedures in multiple native languages (stored procedures and
  functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)

Note that there are a dozen or more other languages as well.  Just FYI. 
Off the top of my head, plPHP, plJ (java there's two different java
implementations, I think) and plR (R is the open source equivalent of
the S statistics language)  

* Replication support still rudimentary.

Hmmmm.  I think that's an overly simplistic evaluation.  The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain".  hehe.  That said, once you've
learned how to drive it, it's quite amazing.  Keep in mind, slony can be
applied to a living database while it's running, and can run between
different major versions of postgresql.  That's a pretty advanced
feature.  Plus, if the replication daemons die (kill -9ed or whatever)
you can restart replication and slony will come right back where it was
and catch up.

Pointers, tips, quick facts and gotchas for other people converting:
--------------------------------------------------------------------

* MySQL combines the concepts of 'database' and 'schema' into one.  PostgreSQL
  differentiates the two.  While the hierarchy in MySQL is
  database.table.field, PostgreSQL is roughly: database.schema.table.field.
  A schema is a 'logically grouped set of tables but still kept within a
  particular database.'  This could allow separate applications to be built
  that still rely upon the same database, but can be kept somewhat logically
  separated.  The default schema in each database is called 'public', and is
  the one referred to if no others are specified.  This can be modified with
  'SET search_path TO ...'.

This is a VERY good analysis of the difference between the two
databases.

* Pg uses a 'template1' pseudo-database that can be tailored to provide
  default objects for new database creation, if you should desire.  It
  obviously also offers a 'template0' database that is read-only and
  offers a barebones database, more equivalent to the empty db created with
  mysql's CREATE DATABASE statement.

This isn't quite right.

template0 is a locked and "pure" copy of the template database.  It's
there for "break glass in case of emergency" use. :)

template1, when you first initdb, is exactly the same as template0, but
you can connect to it, and alter it.  Both of these are "real"
postgresql databases.  template1 is the database that gets copied by
default when you do "create database".  Note that you can also define a
different template database when running create database, which lets you
easily clone any database on your machine.  "create database newdb with
template olddb"

* Pg uses the 'serial' column type instead of AUTO_INCREMENT.  This allows
  more than one independent sequence to be specified per table (though the
  utility of this may be of dubious value).  These are closer to Oracle's
  concept of sequence generators, and they can be manipulated with the
  currval(), nextval(), setval(), and lastval() functions.

Don't forget 64bit bigserials too.

* Pg requires its tables and databases be 'vacuumed' regularly to remove
  completed transaction snapshots and optimize the tables on disk.  It is
  necessary because the way that PostgreSQL implements true MVCC is by
  writing all temporary transactions to disk and setting a visibility
  flag for the record.  Vacuuming can be performed automatically, and in
  a deferred manner by using vacuum_cost settings to limit it to low-load
  periods or based upon numerous other criteria.  See the manual for more
  information.

Interestingly enough, MySQL's innodb tables do almost the exact same
thing, but their vacuum process is wholly automated.  Generally, this
means fewer issues pop up for the new dba, but when they do, they can be
a little harder to deal with.  It's about a wash.  Of course, as you
mentioned earlier, most mysql folks aren't using innodb.

* While MySQL supports transactions with the InnoDB databases, many MySQL
  users generally do not use them extensively enough.  With Pg, due to the
  behaviour of the server in attempting to ensure data integrity in a
  variety of situations (client disconnection, network trouble, server
  crashes, etc.), it is highly advisable to become familiar and utilize
  transactions a lot more, to ensure your DATA is left in a consistent state
  before and after every change you wish to make.

A point you might want to throw in here is that EVERYTHING in postgresql
is a transaction.  If you don't issue a begin statement, then postgresql
runs each statement you type in inside its own transaction.

This means that inserting 10,000 rows without wrapping them inside an
explicit transaction results in 10,000 individual transactions.

However, the more interesting thing here, is that every statement,
including DDL is transactable, except for a couple of big odd ones, like
create database.  So, in postgresql, you can do:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

and there's no change and no lost data.  Quite impressive actually.

Common equivalents:
-------------------

MySQL                           PostgreSQL
-----                           -----------
OPTIMIZE TABLE ...              VACUUM ...

vacuum and analyze for optimize I think.  Also, possibly reindex,
although nominally that's the "sledge hammer" of optimization.

One last thing I'd mention that I REALLY like about PostgreSQL over any
other database I've used is that the psql interface has a complete
syntax lookup feature that is WAY cool.  \h brings it up, and \h COMMAND
where COMMAND is the command you want to look up will bring up the
syntax for your command.  

And, I hate the fact that CTRL-C in the mysql command line tool exits
the tool instead of interrupting the current query.  In PostgreSQL it
interrupts the current query.  CTRL-\ will kill the client if you need
to.

Overall, a great review.  Thanks.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@xxxxxxxxx #