Greetings,

I'm running PostgreSQL 9.0.4 on a Linux x86_64 cluster with one master and two streaming-replication slaves. Since late yesterday, the load on the server has been noticeably higher (5.00+) than normal (generally under 1.00). I investigated, and found that for the past ~18 hours there has been one autovacuum process running without making any obvious progress:

select procpid, query_start, current_query
from pg_stat_activity
where current_query LIKE 'autovacuum%';

 procpid |          query_start          |              current_query
---------+-------------------------------+------------------------------------------
   30188 | 2011-11-21 22:42:26.426315-08 | autovacuum: VACUUM public.nppsmoketests
(1 row)

select c.oid, c.relname, l.pid, l.mode, l.granted
from pg_class c
join pg_locks l on c.oid = l.relation
where l.pid = '30188'
order by l.pid;

  oid  |              relname               |  pid  |           mode           | granted
-------+------------------------------------+-------+--------------------------+---------
 72112 | nppsmoketests                      | 30188 | ShareUpdateExclusiveLock | t
 72617 | nppsmoketests_pkey                 | 30188 | RowExclusiveLock         | t
 72619 | nppsmoketests_bug_idx              | 30188 | RowExclusiveLock         | t
 72620 | nppsmoketests_active_idx           | 30188 | RowExclusiveLock         | t
 72621 | nppsmoketests_arch_idx             | 30188 | RowExclusiveLock         | t
 72622 | nppsmoketests_branch_idx           | 30188 | RowExclusiveLock         | t
 72623 | nppsmoketests_current_status_idx   | 30188 | RowExclusiveLock         | t
 72624 | nppsmoketests_build_type_idx       | 30188 | RowExclusiveLock         | t
 72625 | nppsmoketests_gpu_idx              | 30188 | RowExclusiveLock         | t
 72626 | nppsmoketests_os_idx               | 30188 | RowExclusiveLock         | t
 72627 | nppsmoketests_owner_idx            | 30188 | RowExclusiveLock         | t
 72628 | nppsmoketests_regressioncl_idx     | 30188 | RowExclusiveLock         | t
 72629 | nppsmoketests_subtest_idx          | 30188 | RowExclusiveLock         | t
 72630 | nppsmoketests_suiteid_idx          | 30188 | RowExclusiveLock         | t
 72631 | nppsmoketests_suiteid_testname_idx | 30188 | RowExclusiveLock         | t
 72632 | nppsmoketests_testcl_idx           | 30188 | RowExclusiveLock         | t
 72633 | nppsmoketests_testname_idx         | 30188 | RowExclusiveLock         | t
 80749 | nppsmoketests_osversion_idx        | 30188 | RowExclusiveLock         | t
(18 rows)

When I strace PID 30188, I see tons of this scrolling past quickly, but I'm not really sure what it means, beyond a 'Timeout' not looking good:

select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 753901568, SEEK_SET) = 753901568
read(95, "\202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
lseek(95, 753917952, SEEK_SET) = 753917952
read(95, "\202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 768606208, SEEK_SET) = 768606208
read(95, "\204\1\0\0h!~\233\1\0\0\0\230\0\360\20\360\37\4 \0\0\0\0x\237\360\0\0\237\360\0"..., 8192) = 8192
lseek(95, 753934336, SEEK_SET) = 753934336
read(95, "\202\1\0\0 &\275\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
read(95, "\202\1\0\0\10\33\276\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
lseek(95, 753958912, SEEK_SET) = 753958912
read(95, "\202\1\0\0x\317\307\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 768614400, SEEK_SET) = 768614400

An old thread suggests that this pattern can indicate a stuck spinlock:
http://archives.postgresql.org/pgsql-performance/2009-05/msg00455.php
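For what it's worth, the select() calls above each sleep for 32 ms, so my (uninformed) guess is that this might just be cost-based vacuum throttling rather than a true hang. Here is what I was planning to check next; the settings should all be at their defaults (see below), and the table name is the one from the lock listing:

-- cost-based vacuum throttling settings (all defaults here):
SHOW autovacuum_vacuum_cost_delay;
SHOW autovacuum_vacuum_cost_limit;
SHOW vacuum_cost_delay;

-- total size of the table plus its 17 indexes, to gauge how much
-- data the vacuum has to work through:
SELECT pg_size_pretty(pg_total_relation_size('nppsmoketests'));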
I'm using the defaults for all the *vacuum* options in postgresql.conf, except for:

log_autovacuum_min_duration = 2500

At this point, I'm not sure what I can safely do to either terminate this autovacuum process or kick it into making progress again. Any suggestions?

thanks
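P.S. My best guess at a safe way to stop it, if I have to, is to cancel the backend from SQL rather than kill it from the shell, along these lines (untested on my end, so please correct me if this is the wrong approach for an autovacuum worker):

-- ask backend 30188 (the autovacuum worker) to cancel its current
-- operation; the autovacuum launcher will presumably just pick the
-- table up again on a later pass
SELECT pg_cancel_backend(30188);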