Re: PostgreSQL 8.2.3 VACUUM Timings/Performance

"Bruce McAlister" <bruce.mcalister@xxxxxxxxxxx> · Fri, 16 Mar 2007 19:06:57 -0000

Hi All,

Okay, I'm getting a little further now. I'm about to create entries in the 
pg_autovacuum system tables. However, I'm a little confused as to how I go 
about finding out the OID value of the tables. The pg_autovacuum table 
requires the OID of the table you want to create settings for (vacrelid). 
Can anyone shed some light on how I can extract the OID of the table? Also, 
what happens if you create a table without OID's, are you still able to add 
it's details in the pg_autovacuum table if there is no OID associated with a 
table?

      Name Type References Description
      vacrelid oid pg_class.oid The table this entry is for
      enabled bool   If false, this table is never autovacuumed
      vac_base_thresh integer   Minimum number of modified tuples before 
vacuum
      vac_scale_factor float4   Multiplier for reltuples to add to 
vac_base_thresh
      anl_base_thresh integer   Minimum number of modified tuples before 
analyze
      anl_scale_factor float4   Multiplier for reltuples to add to 
anl_base_thresh
      vac_cost_delay integer   Custom vacuum_cost_delay parameter
      vac_cost_limit integer   Custom vacuum_cost_limit parameter
      freeze_min_age integer   Custom vacuum_freeze_min_age parameter
      freeze_max_age integer   Custom autovacuum_freeze_max_age parameter

Thanks
Bruce

"Bruce McAlister" <bruce.mcalister@xxxxxxxxxxx> wrote in message 
news:esrdsb$1klg$1@xxxxxxxxxxxxxxx
> Hi Tom,
>
> Thanks for the suggestion. It's been a while since I replied to this as I 
> had to go and do some further investigation of the docs with regards the 
> autovacuum daemons configuration. According to the documentation, the 
> formula's for the vacuum and analyze are as follows:
>
> Vacuum
>    vacuum threshold = vacuum base threshold + vacuum scale factor * number 
> of tuples
> Analyze
>    analyze threshold = analyze base threshold + analyze scale factor * 
> number of tuples
>
> My current settings for autovacuum are as follows:
>
> # - Cost-Based Vacuum Delay -
>
> vacuum_cost_delay         = 200                 # 0-1000 milliseconds
> vacuum_cost_page_hit    = 1                     # 0-10000 credits
> vacuum_cost_page_miss = 10                   # 0-10000 credits
> vacuum_cost_page_dirty = 20                   # 0-10000 credits
> vacuum_cost_limit           = 200                 # 0-10000 credits
>
> #---------------------------------------------------------------------------
> # AUTOVACUUM PARAMETERS
> #---------------------------------------------------------------------------
>
> autovacuum                                   = on                      # 
> enable autovacuum subprocess?
> 
> # 'on' requires stats_start_collector
> 
> # and stats_row_level to also be on
> autovacuum_naptime                     = 1min                   # time 
> between autovacuum runs
> autovacuum_vacuum_threshold     = 500                     # min # of tuple 
> updates before
> 
> # vacuum
> autovacuum_analyze_threshold      = 250                    # min # of 
> tuple updates before
> 
> # analyze
> autovacuum_vacuum_scale_factor = 0.2                     # fraction of rel 
> size before
> 
> # vacuum
> autovacuum_analyze_scale_factor = 0.1                     # fraction of 
> rel size before
> 
> # analyze
> autovacuum_freeze_max_age       = 200000000         # maximum XID age 
> before forced vacuum
> 
> # (change requires restart)
> autovacuum_vacuum_cost_delay  = -1                       # default vacuum 
> cost delay for
> 
> # autovacuum, -1 means use
> 
> # vacuum_cost_delay
> autovacuum_vacuum_cost_limit    = -1                       # default 
> vacuum cost limit for
> 
> # autovacuum, -1 means use
> 
> # vacuum_cost_limit
>
> Thus to make the autovacuum more aggressive I am thinking along the lines 
> of changing the following parameters:
>
> autovacuum_vacuum_threshold     = 250
> autovacuum_analyze_threshold     = 125
>
> The documentation also mentions that when the autovacuum runs it selects a 
> single database to process on that run. This means that the particular 
> table that we are interrested in will only be vacuumed once every 17 
> minutes, assuming we have 18 databases and the selection process is 
> sequential through the database list.
>
> From my understanding of the documentation, the only way to work around 
> this issue is to manually update the system catalog table pg_autovacuum 
> and set the pg_autovacuum.enabled field to false to skip the autovacuum on 
> tables that dont require such frequent vacuums. If I do enable this 
> feature, and I manually run a vacuumdb from the command line against that 
> particular disabled table, will the vacuum still process the table? I'm 
> assuming too, that the best tables to disable autovacuum on will be ones 
> with a minimal amount of update/delete queries run against it. For 
> example, if we have a table that only has inserts applied to it, it is 
> safe to assume that that table can safely be ignored by autovacuum.
>
> Do you have any other suggestions as to which tables generally can be 
> excluded from the autovacuum based on the usage patterns?
> Can you see anything with respect to my new autovacuum parameters that may 
> cause issue's and are there any other parameters that you suggest I need 
> to change to make the autovacuum daemon more aggressive?
>
> PS: Currently we have the Cluster command running on the sipaccounts table 
> as the vacuum full is taking too long. It would be nice though to have 
> some piece of mind that the cluster command is mvcc safe, as Heikki and 
> Aidan have mentioned that it is not and may break things in our 
> environment, I'm a little afraid of running with the cluster command, and 
> should possibly go back to the vacuum full :/
>
> Thanks all for any and all suggestions/comments.
>
> Thanks
> Bruce
>
>
> "Tom Lane" <tgl@xxxxxxxxxxxxx> wrote in message 
> news:19106.1173111068@xxxxxxxxxxxxxxxx
>> "Bruce McAlister" <bruce.mcalister@xxxxxxxxxxx> writes:
>>> [1] AutoVacuum runs during the day over the entire PostgreSQL cluster,
>>
>> Good, but evidently you need to make it more aggressive.
>>
>>> [2] A Vacuum Full Verbose is run during our least busy period (generally
>>> 03:30) against the Database,
>>
>>> [3] A Re-Index on the table is performed,
>>
>>> [4] A Cluster on the table is performed against the most used index,
>>
>>> [5] A Vacuum Analyze Verbose is run against the database.
>>
>> That is enormous overkill.  Steps 2 and 3 are a 100% waste of time if
>> you are going to cluster in step 4.  Just do the CLUSTER and then
>> ANALYZE (or VACUUM ANALYZE if you really must, but the value is 
>> marginal).
>>
>> regards, tom lane
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 3: Have you checked our extensive FAQ?
>>
>>               http://www.postgresql.org/docs/faq
>>
>
>