What indexes exist? Are the updates to indexed columns?Each table has 4 indices. The updates are to the indexed columns.
Which day's tuples are deleted every night? Is it the day a week ago, or a month ago, or something else?
If only 50,000 get inserted daily and 1.2M get deleted, this table will soon be empty! I think you said daily when you meant hourly somewhere in there.
Why do you have a 4 minute timeout? That seems counter-productive.
You delete a bunch of tuples every night, so of course a vacuum full after that is going to return a lot of space. But that space is probably just going to be needed again the next day. If you don't do the vacuum full, does the *peak* space keep increasing, or does it stabilize?
managed_target_stats=# select * from pgstattuple('xyz');
table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
------------+-------------+------------+---------------+------------------+----------------+--------------------+------------+--------------
5642526720 | 18880283 | 4042874387 | 71.65 | 122037 | 21495560 | 0.38 | 1239598044 | 21.97
This is one of Tables1 tables and this is after running for a week or so with the default autovacuum settings. The dead_tuple_percent look good. But the free_percent looks high. Is this normal? Also when I enabled autovacuum logs, I saw the autovacuums triggering very rarely. May be that's the reason for too much free space? I am going to try with the following settings:
log_autovacuum_min_duration = 0
autovacuum_vacuum_scale_factor = 0
autovacuum_vacuum_threshold = 40000
autovacuum_vacuum_cost_delay = 10ms
autovacuum_max_workers = 5
autovacuum_analyze_scale_factor = 0
autovacuum_analyze_threshold = 40000
Do these settings look ok? I am setting autovacuum_vacuum_scale_factor and autovacuum_analyze_threshold to 0 and the thresholds to a constant 40000. My reasoning is updates happen in a burst every 5 minutes and the upper bound of updates is 50000 every 5 minutes. So I have just harcoded a threshold of 40000. Also I am leaving autovacuum_naptime at the default of 1 min because updates anyways happen only every 5 minutes at the max. So I don't see a point in running autovacuum more frequently than 1min. Correct me if my settings look wrong.
Thanks,
Elan.
On Tue, Apr 29, 2014 at 4:59 PM, Elanchezhiyan Elango <elanelango@xxxxxxxxx> wrote:
Hi,I need help on deciding my vacuuming strategy. I need to know if I ever need to do 'vacuum full' for my tables.Tables1: Following is the query patterns on 4 high traffic table in my database:1. Every 5 minutes about 50000 rows in the table are updated. And for a given clock hour the same 50000 rows are updated again and again.2. When a new hour begins, another 50000 rows get inserted and they get updated every 5 minutes.3. Every night a days worth of old stats are deleted. So this would be 24 * 50000 = 1.2M records that get deleted every night.What indexes exist? Are the updates to indexed columns? Which day's tuples are deleted every night? Is it the day a week ago, or a month ago, or something else?Tables3: Another 4 tables have the following pattern:1. Every 1 hour 50000 rows get updated. For a given day the same 50000 rows are updated again and again.2. When a new day begins, another 50000 rows get inserted and they get updated every hour.3. Every night 1.2M records get deleted.If only 50,000 get inserted daily and 1.2M get deleted, this table will soon be empty! I think you said daily when you meant hourly somewhere in there.Tables4: Another 39 tables have the following pattern:1. Every 1 hour 2000 rows gets updated. For a given day the same 2000 rows are updated again and again.2. When a new day begins, another 2000 rows get inserted and they get updated every hour.3. Every night 48000 rows get deleted.With the above query pattern with intensive updates and deletes, I need to do some aggressive vacuuming.Current strategy:I am running with default autovacuum settings (postgres 9.1.9) and I tried doing a 'vacuum full' for the 8 high traffic tables (Tables1 and Tables3) every night. But after a point, the 'vacuum full's started timing out (with 4min timeout) every night.Why do you have a 4 minute timeout? That seems counter-productive.I think this is because the table is growing bigger (~5GB) and doing a vacuum full every night is probably not feasible.It is probably not necessary, but it certainly seems feasible. 4 min * 8 tables = 32 minutes. Call it one hour, since the 4 minute timeout has started not being enough. Is the 5GB for the table plus indexes, or just the table itself?Going with the default autovacuum settings and not doing 'vacuum full' at all is also not enough for my usecase. Whenever vacuum full succeeded every night, it did seem to reclaim a considerable amount of space.You delete a bunch of tuples every night, so of course a vacuum full after that is going to return a lot of space. But that space is probably just going to be needed again the next day. If you don't do the vacuum full, does the *peak* space keep increasing, or does it stabilize?Cheers,Jeff