On 27.01.2012 19:43, Jon Nelson wrote:
Let's say I have a 7GB table with 3-4 indices for a total of 10-12GB. Furthermore, let's say I have a machine with sufficient memory for me to set the work_mem and maintenance_work_mem to 20GB (just for this session). When I issue a CLUSTER using one of the indices, I see PostgreSQL (by way of strace) performing an index scan which amounts to large quantities of random I/O. In my case, that means it takes a very, very long time. PostgreSQL is largely at defaults, except for a 2GB shared_buffers and a few unrelated changes. The system itself has 32GB of physical RAM and has plenty free. Why didn't PostgreSQL just read the table into memory (and the interesting index) as a sequential scan, sort, and then write it out? It seems like there would be more than enough memory for that. The sequential I/O rate on this machine is 50-100x the random I/O rate. I'm using 8.4.10 (with the 'inet' de-toasting patch) on Scientific Linux 6.1.
The suppport for doing a seqscan+sort in CLUSTER was introduced in version 9.1. Before that, CLUSTER always did an indexscan. See release notes: http://www.postgresql.org/docs/9.1/static/release-9-1.html#AEN107416
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance