Question about clustering indexes and restores

Harold A. Giménez Ch. <harold.gimenez@xxxxxxxxx> · Thu, 22 Jan 2009 14:52:12 -0500

Hi list,

Clustering my indexes dramatically improves the query performance of many of my queries. Also, the actual clustering takes a very long time for big databases, roughly 20 hours. I have two questions about how to improve this:

1. I've tweaked maintenance_mem_max and effective_cache_size to a point where the cluster operation uses a good chunk of my physical RAM, and the OS does not start swapping. Is there any other parameter I should look at?

2. Reading the documentation for cluster at
http://www.postgresql.org/docs/8.3/interactive/sql-cluster.html, I see
that all clustering does is reorder the data on disk to 'match' the
order of the clustered index. My question is, if I dump a clustered
database using pg_dump in custom format, is it necessary to cluster
after restoring it? Or does a dump/restore not guarantee that the order
of the data restored is the same as the original dumped database?

3. Somewhat related to #2, what is the best way to move data from a staging database on one
server, to the production environment on a different server? I've been
using pg_dump/pg_restore, but there must be a better way...

Thanks for any pointers,

-Harold