Re: 8.0.3 pg_autovacuum doesn't clear out stats table?

Robin Iddon <robin@xxxxxxxxxx> · Thu, 23 Feb 2006 23:34:22 +0000

Tom,

Tom Lane wrote:

Quite incorrect --- if that were so, there would be no dead tuples
for vacuum to remove, hm?  Actually ANALYZE does an
update-or-insert-if-not-present fandango.

Thanks for the correction - I was basing my comments on the fact that 
the tuple count was just climbing for the pg_statistics table, but I did 
wonder why the system would do that deliberately - answer is, it doesn't!.

I'm sort of mystified by your report.  I can think of several ways that
the system might fail to notice that pg_statistic needs vacuuming, but
none of them seem to actually occur in the current code, and I don't see
any relevant difference in the CVS logs (I admit to being too lazy to
build an exact 8.0.3 version to test --- but 8.0.7 seems fine).
Can you keep an eye on pg_statistic's entry in the pg_stat_all_tables

view for each affected database, and confirm whether the n_tup_upd/
n_tup_del counts are rising over time or not?  That would at least
narrow down the problem a bit.

As a baseline I see:

imes=# select * from pg_stat_all_tables where relname = 'pg_statistic';
relid | schemaname |   relname    | seq_scan | seq_tup_read | idx_scan 
| idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
-------+------------+--------------+----------+--------------+----------+---------------+-----------+-----------+-----------
16408 | pg_catalog | pg_statistic |        0 |            0 |  4421694 
|       4273043 |         0 |   2120401 |         0
(1 row)

Now I do:

imes=# analyze;
ANALYZE
imes=# select * from pg_stat_all_tables where relname = 'pg_statistic';
relid | schemaname |   relname    | seq_scan | seq_tup_read | idx_scan 
| idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
-------+------------+--------------+----------+--------------+----------+---------------+-----------+-----------+-----------
16408 | pg_catalog | pg_statistic |        0 |            0 |  4423073 
|       4274484 |         0 |   2121631 |         0
(1 row)

From that I see that the pg_statistics table is never inserted/deleted 
only updated (ins/del counts are 0), and in principle simply analyzing 
doesn't increase the tuple count.

Yet the reltuples in the pg_class table grows, as does the disk space 
used for the pg_statistics table.  What makes it grow?  Is it running 
queries while analyze is running causing new row versions to be created 
and then not getting cleaned up?

I wonder if Matt's suggestion on the vacuum scale is not in fact the 
real issue here.  Because my scale is set to 2 (the default) but the 
number of inserts is always zero (unless we create new tables/indexes, I 
think the number of active stats stays the same, no?) and the number of 
updates is at most equal to the number of tuples, we can never get 
enough changes to trigger a vacuum (which would require 2x the number of 
updates).  Now any removable rows won't ever get removed.

In fact I just noticed that the number of stats tuples just climbed from 
1236 to 2634.  The ins/del counts are still zero.  I ran analyze and the 
update counter went up only by 1232.  For pg_autovacuum to vacuum this 
table I need (2*2634)+1000 = 6268 updates, which is never going to happen.

So I'm going to run with a fractional value of scale for a while and see 
if it makes things any better.

Thanks for all the help,
Robin