On 05/15/2012 02:02 PM, Robert James wrote:
Besides the one time spent CLUSTERing, do I loose anything by doing it
for every table? Does a CLUSTER slow anything down?
It would seem to me that a) a CLUSTER should never have worse
performance than a random order b) may have better performance and c)
has the benefits of a VACUUM and REINDEX. So, I may as well cluster
every table, if only by the primary key.
Cluster requires an exclusive lock - you need a sufficiently long
maintenance window. (But so does vacuum full which is typically slower
and vacuum-full doesn't make shiny new indexes.)
Cluster requires sufficient disk space to create the new copies of the
table.
Cluster only reorganizes data at the time you do the cluster -
additional use after the cluster will cause data to lose its order.
Since you are clustering based on an index you can only choose one
cluster order.
Cluster should have better performance but it depends on the index you
choose relative to the queries you typically run. Let's say that you
have an accounting system where you most often grab the most recent
month worth of data. Clustering that keeps that data together will be
beneficial but you could easily have a different index, item-number for
instance, that would, if used for clustering, leave the commonly used
data scattered throughout the table. If that table was an append-only
detail table the most commonly used data would naturally clump together
which clustering would then destroy.
With the exception of disk-space constraints there are few reasons to
use vacuum-full/reindex instead of cluster.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general