Greetings, I have a 4 server postgresql-9.1.3 cluster (one master doing streaming replication to 3 hot standby servers). All of them are running Fedora-16-x86_64. Last Friday I upgraded the entire cluster from Fedora-15 with postgresql-9.0.6 to Fedora-16 with postgresql-9.1.3. I made no changes to postgresql.conf following the upgrade. I used pg_upgrade on the master to upgrade it, followed by blowing away $PGDATA on all the standbys and rsyncing them fresh from the master. All of the servers have 128GB RAM, and at least 16 CPU cores. Everything appeared to be working fine until last night when the load on the master suddenly took off, and hovered at around 30.00 ever since. Prior to the load spike, the load was hovering around 2.00 (which is actually lower than it was averaging prior to the upgrade when it was often around 4.00). When I got in this morning, I found an autovacuum process that had been running since just before the load spiked, and the pg_dump cronjob that started shortly after the load spike (and normally completes in about 20 minutes for all the databases) was still running, and hadn't finished the first of the 6 databases. I ended up killing the pg_dump process altogether in the hope that it might unblock whatever was causing the high load. Unfortunately that didn't help, and the load continued to run high. I proceeded to check dmesg, /var/log/messages and the postgresql server log (all on the master), but I didn't spot anything out of the ordinary, definitely nothing that pointed to a potential explanation for all of the high load. I inspected what the autovacuum process was doing, and determined that it was chewing away on the largest table (nearly 98 million rows) in the largest database. It was making very slow progress, at least I believe that was the case, as when I attached strace to the process, the seek addresses were changing in a random fashion. Here are the current autovacuum settings: autovacuum | on autovacuum_analyze_scale_factor | 0.1 autovacuum_analyze_threshold | 50 autovacuum_freeze_max_age | 200000000 autovacuum_max_workers | 4 autovacuum_naptime | 1min autovacuum_vacuum_cost_delay | 20ms autovacuum_vacuum_cost_limit | -1 autovacuum_vacuum_scale_factor | 0.2 autovacuum_vacuum_threshold | 50 Did something significant change in 9.1 that would impact autovacuum behavior? I'm at a complete loss on how to debug this, since I'm using the exact same settings now as prior to the upgrade. thanks -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general