Hi all. I’m running PostgreSQL 9.3.4 and doing stress test of the database with writing only load. The test plan does 1000 transactions per second (each of them does several updates/inserts). The problem is that checkpoint is not distributed over time well. When the checkpoint finishes, the db gets lots of I/O operations and response timings grows strongly. My checkpoint settings looks like that: postgres=# select name, setting from pg_catalog.pg_settings where name like 'checkpoint%' and boot_val != reset_val; name | setting ------------------------------+--------- checkpoint_completion_target | 0.9 checkpoint_segments | 100500 checkpoint_timeout | 600 (3 rows) postgres=# But in the log I see that checkpoint continues less than 600*0.9 = 540 seconds: 2014-04-14 12:54:41.479 MSK,,,10517,,53468da6.2915,433,,2014-04-10 16:25:10 MSK,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,"" 2014-04-14 12:57:06.107 MSK,,,10517,,53468da6.2915,434,,2014-04-10 16:25:10 MSK,,0,LOG,00000,"checkpoint complete: wrote 65140 buffers (24.8%); 0 transaction log file(s) added, 0 removed, 327 recycled; write=134.217 s, sync=10.292 s, total=144.627 s; sync files=31, longest=3.332 s, average=0.331 s",,,,,,,,,»" When the checkpoint starts (12:54:41.479) dstat says that I/O load increases: ----system---- -dsk/total- --io/total- date/time | read writ| read writ 14-04 12:54:39| 0 15M| 0 2562 14-04 12:54:40| 0 13M| 0 2330 14-04 12:54:41| 0 97M| 0 5981 14-04 12:54:42| 0 95M| 0 8869 14-04 12:54:43| 0 147M| 0 8493 14-04 12:54:44| 0 144M| 0 8316 14-04 12:54:45| 0 176M| 0 8189 14-04 12:54:46| 0 141M| 0 8221 14-04 12:54:47| 0 143M| 0 8260 14-04 12:54:48| 0 141M| 0 7576 14-04 12:54:49| 0 173M| 0 8171 But when it finishes (12:57:06.107) the I/O load is much higher than the hardware can do: ----system---- -dsk/total- --io/total- date/time | read writ| read writ 14-04 12:56:52| 0 33M| 0 5185 14-04 12:56:53| 0 64M| 0 5271 14-04 12:56:54| 0 65M| 0 5256 14-04 12:56:55| 0 153M| 0 15.8k 14-04 12:56:56| 0 758M| 0 18.6k 14-04 12:56:57| 0 823M| 0 4164 14-04 12:56:58| 0 843M| 0 8186 14-04 12:56:59| 0 794M| 0 15.0k 14-04 12:57:00| 0 880M| 0 5954 14-04 12:57:01| 0 862M| 0 4608 14-04 12:57:02| 0 804M| 0 7387 14-04 12:57:03| 0 849M| 0 4878 14-04 12:57:04| 0 788M| 0 20.0k 14-04 12:57:05| 0 805M| 0 6004 14-04 12:57:06| 0 143M| 0 6932 14-04 12:57:07| 0 108M| 0 6150 14-04 12:57:08| 0 42M| 0 6233 14-04 12:57:09| 0 73M| 0 6248 Response timings of the application at this moment look like that: The hardware is quite good to handle this load (PGDATA lives on soft raid10 array of 8 ssd drives). I’ve done the same test with 3000 tps - the result was exactly the same. The only difference was that I/O spikes had been stronger. So my question is why the checkpoint is not spread for 540 seconds? Is there any way to understand why I/O spike happens when the checkpoint finishes but does not happen during all of the checkpoint process? Any help would be really appropriate. -- Vladimir |