The autovacuum daemon currently uses the number of inserted and updated tuples to decide whether to run VACUUM ANALYZE on a table. Why doesn't it consider deleted tuples as well?

For example, I have a table that is initially loaded with several million records. A batch process grabs the records 100 at a time, does some processing, and deletes them from the table in primary-key order. Eventually performance degrades because an autoanalyze is never run: the planner chooses a sequential scan instead of an index scan because the statistics no longer reflect reality. See the example below.

I can set up a cron job to run the ANALYZE manually, but it seems like the autovacuum daemon should be smart enough to figure this out on its own. Deletes can have as big an impact on the stats as inserts and updates.

Joe Miller

---------------------------

testdb=# \d test
      Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 data   | bytea   |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

testdb=# insert into public.test select s.a, gen_random_bytes(256) from generate_series(1,10000000) as s(a);
INSERT 0 10000000

testdb=# SELECT * FROM pg_stat_all_tables WHERE schemaname='public' AND relname='test';
  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze |       last_autoanalyze
---------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------+-----------------+--------------+-------------------------------
 5608158 | public     | test    |        1 |            0 |        0 |             0 |  10000000 |         0 |         0 |             0 |          0 |          0 |             |                 |              | 2010-09-20 10:46:37.283775-04
(1 row)

testdb=# explain analyze delete from public.test where id <= 100;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=0.00..71.63 rows=1000 width=6) (actual time=13.251..22.916 rows=100 loops=1)
   Index Cond: (id <= 100)
 Total runtime: 23.271 ms
(3 rows)

{ delete records ad nauseam }

testdb=# explain analyze delete from public.test where id <= 7978800;
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..410106.17 rows=2538412 width=6) (actual time=48771.772..49681.562 rows=100 loops=1)
   Filter: (id <= 7978800)
 Total runtime: 49682.006 ms
(3 rows)

testdb=# SELECT * FROM pg_stat_all_tables WHERE schemaname='public' AND relname='test';
  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | last_vacuum |        last_autovacuum        | last_analyze |       last_autoanalyze
---------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------+-------------------------------+--------------+-------------------------------
 5608158 | public     | test    |        1 |            0 |    54345 |       5433206 |  10000000 |         0 |   5433200 |             0 |    5459506 |     725300 |             | 2010-09-20 14:45:54.757611-04 |              | 2010-09-20 10:46:37.283775-04
(1 row)

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
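P.S. For anyone hitting the same problem, the workaround I'm using in the meantime is the cron job mentioned above:

    psql -c "ANALYZE public.test" testdb

Another option (a sketch only; assumes 8.4+ per-table storage parameters, and the numbers are purely illustrative, so tune them for your table size) is to lower the table's analyze threshold so autoanalyze fires after far fewer tuple changes:

    ALTER TABLE public.test
      SET (autovacuum_analyze_scale_factor = 0.02,
           autovacuum_analyze_threshold = 1000);

Neither is a real fix if deletes aren't counted toward the trigger, but both keep the planner's stats close enough to reality to preserve the index scan.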