Tim Allen wrote: > We have a customer who are having performance problems. They have a > large (36G+) postgres 8.1.3 database installed on an 8-way opteron with > 8G RAM, attached to an EMC SAN via fibre-channel (I don't have details > of the EMC SAN model, or the type of fibre-channel card at the moment). > They're running RedHat ES3 (which means a 2.4.something Linux kernel). > > They are unhappy about their query performance. We've been doing various > things to try to work out what we can do. One thing that has been > apparent is that autovacuum has not been able to keep the database > sufficiently tamed. A pg_dump/pg_restore cycle reduced the total > database size from 81G to 36G. Performing the restore took about 23 hours. Hi Tim! to give you some comparision - we have a similiar sized database here (~38GB after a fresh restore and ~76GB after some months into production). the server is a 4 core Opteron @2,4Ghz with 16GB RAM, connected via 2 QLogic 2Gbit HBA's to the SAN (IBM DS4300 Turbo). It took us quite a while to get this combination up to speed but a full dump&restore cycle (via a pg_dump | psql pipe over the net) now takes only about an hour. 23 hours or even 5 hours sounds really excessive - I'm wondering about some basic issues with the SAN. If you are using any kind of multipathing (most likely the one in the QLA-drivers) I would at first assume that you are playing ping-pong between the controllers (ie the FC-cards do send IO to more than one SAN-head causing those to failover constantly completely destroying performance). ES3 is rather old too and I don't think that even their hacked up kernel is very good at driving a large Opteron SMP box (2.6 should be MUCH better in that regard). Other than that - how well is your postgresql instance tuned to your hardware ? Stefan