Hi, we currently encounter an increasing load on our website. With the increasing load we see some problems on our database. so we checked what happens and we saw spikes in our load when checkpoints are about to finish. Our configuration: max_connections = 125 ssl = false shared_buffers = 500MB work_mem = 15MB maintenance_work_mem = 250MB synchronous_commit = off full_page_writes = off checkpoint_segments = 10 checkpoint_timeout = 10min checkpoint_completion_target = 0.9 random_page_cost = 2 effective_cache_size = 5000MB autovacuum = off (we put autovacuum to off because we suspected it). all the other parameters are set to default (beside log parameters and so on). Our machine has 12 GB RAM, i7-975 CPU and a SW-Raid-1 for datadir and another one for the rest of the server (including postgresql logfiles). Disk are "Barracuda 7200.11 SATA 3Gb/s 1.5-TB" and we are running debian lenny. these are our checkpoint log statements: 2010-06-09 17:24:27 CEST [6375]: [1-1] LOG: checkpoint starting: time 2010-06-09 17:28:09 CEST [6375]: [2-1] LOG: checkpoint complete: wrote 2861 buffers (4.5%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=193.057 s, sync=29.259 s, total=222.353 s 2010-06-09 17:34:27 CEST [6375]: [3-1] LOG: checkpoint starting: time 2010-06-09 17:39:09 CEST [6375]: [4-1] LOG: checkpoint complete: wrote 3247 buffers (5.1%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=255.255 s, sync=26.911 s, total=282.177 s 2010-06-09 17:44:27 CEST [6375]: [5-1] LOG: checkpoint starting: time 2010-06-09 17:49:41 CEST [6375]: [6-1] LOG: checkpoint complete: wrote 2746 buffers (4.3%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=280.743 s, sync=33.392 s, total=314.147 s 2010-06-09 17:54:27 CEST [6375]: [7-1] LOG: checkpoint starting: time 2010-06-09 17:58:59 CEST [6375]: [8-1] LOG: checkpoint complete: wrote 3118 buffers (4.9%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=253.293 s, sync=18.585 s, total=271.892 s 2010-06-09 18:04:27 CEST [6375]: [9-1] LOG: checkpoint starting: time 2010-06-09 18:08:46 CEST [6375]: [10-1] LOG: checkpoint complete: wrote 2695 buffers (4.2%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=225.173 s, sync=33.789 s, total=258.972 s 2010-06-09 18:14:27 CEST [6375]: [11-1] LOG: checkpoint starting: time 2010-06-09 18:18:30 CEST [6375]: [12-1] LOG: checkpoint complete: wrote 2868 buffers (4.5%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=215.561 s, sync=27.701 s, total=243.271 s What we saw is a rather long sync time. And exactly at this time, our responses become slow and the server load increases. this is from pg_stat_bgwriter: checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc -------------------+-----------------+--------------------+---------------+------------------+-----------------+--------------- 3495 | 0 | 9070242 | 10798927 | 14421 | 6412707 | 208340755 We tried to increase checkpoint_timeout to 20 minutes but it was getting worse: 2010-06-09 18:34:27 CEST [6375]: [13-1] LOG: checkpoint starting: time 2010-06-09 18:42:49 CEST [6375]: [14-1] LOG: checkpoint complete: wrote 2956 buffers (4.6%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=448.265 s, sync=54.087 s, total=502.377 s here we see a 54 sec sync time and a much higher load on sync time compared to the 10 minutes checkpoint_timeout. do you have any hints for us how to tune our configuration to avoid spikes? kind regards Janning -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general