PostgreSQL 8.2.3 VACUUM Timings/Performance

"Bruce McAlister" <bruce.mcalister@xxxxxxxxxxx> · Mon, 5 Mar 2007 12:33:18 -0000

Hi All,

I tried posting this last week but it has not come through yet, so please 
excuse me if there is a double post.

We're having some issue's with the vacuum times within our database 
environment, and would like some input from the guru's out there that could 
potentially suggest a better approach or mechanism.

------------------------------------------------------------------------------------

Problem Background

First off, I'm no PostgreSQL guru, so, please be nice :)

Over time we have noticed increased response times from the database which 
has an adverse affect on our registration times. After doing some research 
it appears that this may have been related to our maintenance regime, and 
has thus been amended as follows:

[1] AutoVacuum runs during the day over the entire PostgreSQL cluster,

[2] A Vacuum Full Verbose is run during our least busy period (generally 
03:30) against the Database,

[3] A Re-Index on the table is performed,

[4] A Cluster on the table is performed against the most used index,

[5] A Vacuum Analyze Verbose is run against the database.

These maintenance steps have been setup to run every 24 hours.

The database in essence, once loaded up and re-index is generally around 
17MB for data and 4.7MB for indexes in size.

Over a period of 24 hours the database can grow up to around 250MB and the 
indexes around 33MB (Worst case thus far). When the maintenance kicks in, 
the vacuum full verbose step can take up to 15 minutes to complete (worst 
case). The re-index, cluster and vacuum analyze verbose steps complete in 
under 1 second each. The problem here is the vacuum full verbose, which 
renders the environment unusable during the vacuum phase. The idea here is 
to try and get the vacuum full verbose step to complete in less than a 
minute. Ideally, if we could get it to complete quicker then that would be 
GREAT, but our minimal requirement is for it to complete at the very most 1 
minute. Looking at the specifications of our environment below, do you think 
that this is at all possible?

------------------------------------------------------------------------------------

Environment Background:

We are running a VoIP service whereby the IP phones perform a registration 
request every 113 seconds. These registration requests are verified against 
the database and the details are updated accordingly. Currently we average 
around 100 - 150 read/write requests per second to this particular database. 
The requirement here is that the database response is sub 15 milliseconds 
for both types of requests, which it currently is. The database _must_ also 
be available 24x7.

------------------------------------------------------------------------------------

Hardware Environment:

SunFire X4200

      2 x Dual Core Opteron 280's

      8GB RAM

      2 x Q-Logic Fibre Channel HBA's

Sun StorEdge 3511 FC SATA Array

      1 x 1GB RAID Module

      12 x 250GB 7200 RPM SATA disks

------------------------------------------------------------------------------------

RAID Environment:

      5 Logical drives, each LD is made up of 2 x 250GB SATA HDD in a RAID 1 
mirror.

      2 x 250GB SATA HDD allocated as hot spares

The logical drives are partitioned and presented to the OS as follows:

      LD0 (2 x 250GB SATA HDD's RAID 1)

        Partition 0 (120GB)

        Partition 1 (120GB)

      LD1 (2 x 250GB SATA HDD's RAID 1)

        Partition 0 (120GB)

        Partition 1 (120GB)

      LD2 (2 x 250GB SATA HDD's RAID 1)

        Partition 0 (80GB)

        Partition 1 (80GB)

        Partition 2 (80GB)

      LD3 (2 x 250GB SATA HDD's RAID 1)

        Partition 0 (80GB)

        Partition 1 (80GB)

        Partition 2 (80GB)

      LD4 (2 x 250GB SATA HDD's RAID 1)

        Partition 0 (120GB)

        Partition 1 (120GB)

-------------------------------------------------------------------------------------

OS Environment

Solaris 10 Update 3 (11/06)

      Boot disks are 76GB 15000 RPM configure in a RAID 1 mirror.

-------------------------------------------------------------------------------------

Filesystem Layout

      PostgreSQL Data

        250GB ZFS file-system made up of:

            LD0 Partition 0 Mirrored to LD1 Partition 0 (120GB)

            LD0 Partition 1 Mirrored to LD1 Partition 1 (120GB)

        The above 2 vdevs are then striped across each other

      PostgreSQL WAL

        80GB ZFS filesystem made up of:

            LD2 Partition 0 Mirrored to LD3 Partition 0 (80GB)

            LD2 partition 1 Mirrored to LD3 Partition 1 (80GB)

        The above 2 vdevs are then striped across each other

-------------------------------------------------------------------------------------

PostgreSQL Configuration

      PostgreSQL version 8.2.3

#---------------------------------------------------------------------------

# RESOURCE USAGE (except WAL)

#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 1024MB                 # min 128kB or max_connections*16kB

                                        # (change requires restart)

temp_buffers = 8MB                      # min 800kB

max_prepared_transactions = 200         # can be 0 or more

                                        # (change requires restart) # Note: 
increasing max_prepared_transactions costs ~600 bytes of shared memory # per 
transaction slot, plus lock space (see max_locks_per_transaction).

work_mem = 1MB                          # min 64kB

maintenance_work_mem = 256MB            # min 1MB

max_stack_depth = 2MB                   # min 100kB

# - Free Space Map -

max_fsm_pages = 2048000                 # min max_fsm_relations*16, 6 bytes 
each

                                        # (change requires restart)

max_fsm_relations = 10000               # min 100, ~70 bytes each

                                        # (change requires restart)

# - Kernel Resource Usage -

#max_files_per_process = 1000           # min 25

                                        # (change requires restart)

#shared_preload_libraries = ''          # (change requires restart)

# - Cost-Based Vacuum Delay -

vacuum_cost_delay = 200                 # 0-1000 milliseconds

vacuum_cost_page_hit = 1                # 0-10000 credits

vacuum_cost_page_miss = 10              # 0-10000 credits

vacuum_cost_page_dirty = 20             # 0-10000 credits

vacuum_cost_limit = 200                 # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200ms                 # 10-10000ms between rounds

#bgwriter_lru_percent = 1.0             # 0-100% of LRU buffers 
scanned/round

#bgwriter_lru_maxpages = 5              # 0-1000 buffers max written/round

#bgwriter_all_percent = 0.333           # 0-100% of all buffers 
scanned/round

#bgwriter_all_maxpages = 5              # 0-1000 buffers max written/round

#---------------------------------------------------------------------------

# WRITE AHEAD LOG

#---------------------------------------------------------------------------

# - Settings -

fsync = on                              # turns forced synchronization on or 
off

wal_sync_method = open_datasync         # the default is the first option

                                        # supported by the operating system:

                                        #   open_datasync

                                        #   fdatasync

                                        #   fsync

                                        #   fsync_writethrough

                                        #   open_sync

#full_page_writes = on                  # recover from partial page writes

wal_buffers = 512kB                     # min 32kB

                                        # (change requires restart)

commit_delay = 10000                    # range 0-100000, in microseconds

commit_siblings = 50                    # range 1-1000

# - Checkpoints -

checkpoint_segments = 128               # in logfile segments, min 1, 16MB 
each

checkpoint_timeout = 5min               # range 30s-1h

checkpoint_warning = 5min               # 0 is off

#---------------------------------------------------------------------------

# QUERY TUNING

#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on

#enable_hashagg = on

#enable_hashjoin = on

#enable_indexscan = on

#enable_mergejoin = on

#enable_nestloop = on

#enable_seqscan = on

#enable_sort = on

#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                    # measured on an arbitrary scale

random_page_cost = 2.0                  # same scale as above

#cpu_tuple_cost = 0.01                  # same scale as above

#cpu_index_tuple_cost = 0.005           # same scale as above

#cpu_operator_cost = 0.0025             # same scale as above

effective_cache_size = 5120MB

# - Genetic Query Optimizer -

#geqo = on

#geqo_threshold = 12

#geqo_effort = 5                        # range 1-10

#geqo_pool_size = 0                     # selects default based on effort

#geqo_generations = 0                   # selects default based on effort

#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10         # range 1-1000

#constraint_exclusion = off

#from_collapse_limit = 8

#join_collapse_limit = 8                # 1 disables collapsing of explicit

                                        # JOINs

#---------------------------------------------------------------------------

# RUNTIME STATISTICS

#---------------------------------------------------------------------------

# - Query/Index Statistics Collector -

stats_command_string = on

update_process_title = on

stats_start_collector = on              # needed for block or row stats

                                        # (change requires restart) 
stats_block_level = on stats_row_level = on

stats_reset_on_server_start = off       # (change requires restart)

# - Statistics Monitoring -

#log_parser_stats = off

#log_planner_stats = off

#log_executor_stats = off

#log_statement_stats = off

#---------------------------------------------------------------------------

# AUTOVACUUM PARAMETERS

#---------------------------------------------------------------------------

autovacuum = on                         # enable autovacuum subprocess?

                                        # 'on' requires 
stats_start_collector

                                        # and stats_row_level to also be on

autovacuum_naptime = 1min               # time between autovacuum runs

autovacuum_vacuum_threshold = 500       # min # of tuple updates before

                                        # vacuum

autovacuum_analyze_threshold = 250      # min # of tuple updates before

                                        # analyze

autovacuum_vacuum_scale_factor = 0.2    # fraction of rel size before

                                        # vacuum

autovacuum_analyze_scale_factor = 0.1   # fraction of rel size before

                                        # analyze

autovacuum_freeze_max_age = 200000000   # maximum XID age before forced 
vacuum

                                        # (change requires restart)

autovacuum_vacuum_cost_delay = -1       # default vacuum cost delay for

                                        # autovacuum, -1 means use

                                        # vacuum_cost_delay

autovacuum_vacuum_cost_limit = -1       # default vacuum cost limit for

                                        # autovacuum, -1 means use

                                        # vacuum_cost_limit

------------------------------------------------------------------------------------

Database Environment

There are 18 databases running within this PostgreSQL cluster, the database 
we are having the vacuum issue with is described as follows:

                  List of relations

 Schema |        Name         |   Type   |   Owner

--------+---------------------+----------+------------

 public | dialplans           | table    |

 public | extensions          | table    |

 public | extensions_id_seq   | sequence |

 public | iaxaccounts         | table    |

 public | numbercategories    | table    |

 public | numbers             | table    |

 public | numbersubcategories | table    |

 public | portnumbers         | table    |

 public | portrequests        | table    |

 public | sipaccounts         | table    |

 public | voicemailaccounts   | table    |

The table (with indexes) being used the most is described as follows:

                                  Table "public.sipaccounts"

       Column       |           Type           |                  Modifiers

--------------------+--------------------------+------------------------

--------------------+--------------------------+----------------------

 id                 | character varying(36)    | not null

 name               | character varying(80)    | not null

 accountcode        | character varying(20)    |

 amaflags           | character varying(13)    |

 callgroup          | character varying(10)    |

 callerid           | character varying(80)    |

 canreinvite        | character(3)             | default 'no'::bpchar

 context            | character varying(80)    |

 defaultip          | character varying(15)    |

 dtmfmode           | character varying(7)     |

 fromuser           | character varying(80)    |

 fromdomain         | character varying(80)    |

 fullcontact        | character varying(80)    |

 host               | character varying(31)    | not null default 
''::character varying

 insecure           | character varying(11)    |

 language           | character(2)             |

 mailbox            | character varying(50)    |

 md5secret          | character varying(80)    |

 nat                | character varying(5)     | not null default 
'no'::character varying

 deny               | character varying(95)    |

 permit             | character varying(95)    |

 mask               | character varying(95)    |

 pickupgroup        | character varying(10)    |

 port               | character varying(5)     | not null default 
''::character varying

 qualify            | character(4)             |

 restrictcid        | character(1)             |

 rtptimeout         | character(3)             |

 rtpholdtimeout     | character(5)             |

 secret             | character varying(80)    |

 type               | character varying(6)     | not null default 
'friend'::character varying

 username           | character varying(80)    | not null default 
''::character varying

 disallow           | character varying(100)   |

 allow              | character varying(100)   |

 musiconhold        | character varying(100)   |

 regseconds         | integer                  | not null default 0

 ipaddr             | character varying(15)    | not null default 
''::character varying

 regexten           | character varying(80)    | not null default 
''::character varying

 cancallforward     | character(3)             | default 'yes'::bpchar

 setvar             | character varying(100)   | not null default 
''::character varying

 inserted           | timestamp with time zone | not null default now()

 lastregister       | timestamp with time zone |

 useragent          | character varying(128)   |

 natsendkeepalives  | character(1)             | default 'n'::bpchar

 natconnectionstate | character(1)             |

 outboundproxyport  | character varying(5)     | not null default 
''::character varying

 outboundproxy      | character varying(31)    | not null default 
''::character varying

 voicemailextension | character varying(128)   |

 pstncallerid       | character varying(24)    | default 'Uknown'::character 
varying

 dialplan           | character varying(64)    |

 whitelabelid       | character varying(32)    |

 localcallprefix    | character varying(10)    |

Indexes:

    "sippeers_pkey" PRIMARY KEY, btree (id), tablespace "bf_service_idx"

    "sippeers_name_key" UNIQUE, btree (name) CLUSTER, tablespace 
"bf_service_idx"

    "accountcode_index" btree (accountcode), tablespace "bf_service_idx"

If anyone has any comments/suggestions please feel free to respond. Any 
responses are most welcome.

Thanks

Bruce