On Oct 3, 2013, at 23:56, Michal TOMA <mt@xxxxxxxxxx> wrote:

> I have a problem on my pg 9.2.4 setup (OpenSuse 12.2, kernel 3.2.13).
> My pg_xlog directory is growing uncontrollably until it fills the partition. The database is under heavy write load and is spread over two tablespaces: one on an SSD software RAID 1 partition and a second on an HDD software RAID 1 partition.
> I have no WAL archiving enabled, nor any replication.
>
> I have tried different checkpoint-related parameters without any noticeable improvement.
> Currently I have:
>
> checkpoint_completion_target = 0.9
> wal_buffers = 8MB
> checkpoint_segments = 16
> checkpoint_timeout = 20min
> shared_buffers = 2GB
> log_checkpoints = on
>
> This is what I can see in the log:
>
> 2013-10-03 13:58:56 CEST LOG: checkpoint starting: xlog
> 2013-10-03 13:59:56 CEST LOG: checkpoint complete: wrote 448 buffers (0.2%); 0 transaction log file(s) added, 9 removed, 18 recycled; write=39.144 s, sync=21.136 s, total=60.286 s; sync files=380, longest=14.517 s, average=0.055 s
> 2013-10-03 14:04:07 CEST LOG: checkpoint starting: xlog
> 2013-10-03 15:27:01 CEST LOG: checkpoint complete: wrote 693 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 16 recycled; write=90.775 s, sync=4883.295 s, total=4974.074 s; sync files=531, longest=152.855 s, average=9.196 s
> 2013-10-03 15:27:01 CEST LOG: checkpoint starting: xlog time
> 2013-10-03 19:06:30 CEST LOG: checkpoint complete: wrote 3467 buffers (1.3%); 0 transaction log file(s) added, 0 removed, 16 recycled; write=122.555 s, sync=13046.077 s, total=13168.637 s; sync files=650, longest=234.697 s, average=20.069 s
> 2013-10-03 19:06:30 CEST LOG: checkpoint starting: xlog time
> 2013-10-03 22:30:25 CEST LOG: checkpoint complete: wrote 10198 buffers (3.9%); 0 transaction log file(s) added, 216 removed, 33 recycled; write=132.229 s, sync=12102.311 s, total=12234.608 s; sync files=667, longest=181.374 s, average=18.144 s
> 2013-10-03 22:30:25 CEST LOG: checkpoint starting: xlog time
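[Editor's sketch, not part of the original thread.] Pulling the timings out of one of those checkpoint lines makes the problem concrete: almost the entire checkpoint is spent in the sync phase, not the write phase. A minimal parse of the 19:06:30 line above:

```python
import re

# Checkpoint log line copied verbatim from the report above.
line = ("2013-10-03 19:06:30 CEST LOG: checkpoint complete: wrote 3467 buffers (1.3%); "
        "0 transaction log file(s) added, 0 removed, 16 recycled; "
        "write=122.555 s, sync=13046.077 s, total=13168.637 s; "
        "sync files=650, longest=234.697 s, average=20.069 s")

m = re.search(r"write=([\d.]+) s, sync=([\d.]+) s, total=([\d.]+) s", line)
write, sync, total = map(float, m.groups())

print(f"sync share of checkpoint: {sync / total:.1%}")  # ~99.1% of the checkpoint
print(f"sync duration: {sync / 3600:.1f} hours")        # ~3.6 hours for one checkpoint
```

With sync taking hours while the write phase takes ~2 minutes, the bottleneck is clearly fsync latency on the storage, not checkpoint pacing.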
I'm not too familiar with checkpoint logging output, but from the looks of it you're literally spending hours syncing checkpoints. Are those disks on a RAID controller with a failed cache battery or something? You aren't using RAID-5, are you?

> -------------------------------------------------------------------------------------------------------------
>
> When the server is up and running under the usual load I get the following results:
>
> -------------------------------------------------------------------------------------------------------------
> 2 seconds per test
> O_DIRECT supported on this platform for open_datasync and open_sync.
>
> Compare file sync methods using one 8kB write:
> (in wal_sync_method preference order, except fdatasync is Linux's default)
>         open_datasync          0.369 ops/sec
>         fdatasync              0.575 ops/sec
>         fsync                  0.125 ops/sec
>         fsync_writethrough     n/a
>         open_sync              0.222 ops/sec
>
> Compare file sync methods using two 8kB writes:
> (in wal_sync_method preference order, except fdatasync is Linux's default)
>         open_datasync          0.383 ops/sec
>         fdatasync              2.171 ops/sec
>         fsync                  1.318 ops/sec
>         fsync_writethrough     n/a
>         open_sync              0.929 ops/sec
>
> Compare open_sync with different write sizes:
> (This is designed to compare the cost of writing 16kB in different write open_sync sizes.)
>          1 * 16kB open_sync write      0.079 ops/sec
>          2 *  8kB open_sync writes     0.041 ops/sec
>          4 *  4kB open_sync writes     0.194 ops/sec
>          8 *  2kB open_sync writes     0.013 ops/sec
>         16 *  1kB open_sync writes     0.005 ops/sec
>
> Test if fsync on non-write file descriptor is honored:
> (If the times are similar, fsync() can sync data written on a different descriptor.)
>         write, fsync, close    0.098 ops/sec
>         write, close, fsync    0.067 ops/sec
>
> Non-Sync'ed 8kB writes:
>         write                  0.102 ops/sec
> -------------------------------------------------------------------------------------------------------------

Those numbers look bad. Are these the SSDs or the software RAID?
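[Editor's sketch, not part of the original thread.] One way to read those pg_test_fsync numbers is to invert ops/sec into per-sync latency: very roughly, a healthy 7200 rpm disk manages on the order of 100 syncs/sec (~10 ms each, one platter rotation), and an SSD far more, so whole seconds per sync is pathological. Converting the single-8kB-write results above:

```python
# ops/sec values copied from the single-8kB-write test above.
results = {
    "open_datasync": 0.369,
    "fdatasync": 0.575,
    "fsync": 0.125,
    "open_sync": 0.222,
}

# Invert throughput into latency: 1000 ms / (ops/sec) = ms per sync.
for method, ops in results.items():
    print(f"{method:>14}: {1000 / ops:8.0f} ms per sync")
# fdatasync comes out around 1.7 seconds per sync, fsync around 8 seconds --
# three orders of magnitude slower than a healthy spinning disk.
```

At these latencies every WAL flush stalls for seconds, which is entirely consistent with checkpoints whose sync phase runs for hours.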
It's almost as if you're saturating your disk I/O bandwidth. What hardware is involved here? Or is it a kernel limitation, perhaps?

> I need to tell the server to limit the number of WAL files in pg_xlog somehow, whatever the effect on performance might be.

I think more's at play here. Unfortunately, if it's not directly related to the things I mentioned, I can't help much. I'm a bit out of my league here though - I already made a lot of assumptions about how to interpret this data.

Cheers,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
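[Editor's sketch, not part of the original thread.] On the quoted request to cap pg_xlog: PostgreSQL 9.2 has no hard limit setting (max_wal_size only arrived in 9.5). The 9.2 documentation gives the expected steady-state ceiling as (2 + checkpoint_completion_target) * checkpoint_segments + 1 segments of 16 MB each, so with the poster's settings pg_xlog should normally stay under ~760 MB; growth far beyond that means checkpoints cannot complete fast enough, which matches the hours-long sync phases above. The arithmetic:

```python
# Settings copied from the report above; segment size is the 9.2 default.
checkpoint_segments = 16
checkpoint_completion_target = 0.9
segment_mb = 16

# Expected pg_xlog ceiling per the 9.2 WAL configuration docs:
# (2 + checkpoint_completion_target) * checkpoint_segments + 1 segments.
max_segments = (2 + checkpoint_completion_target) * checkpoint_segments + 1

print(f"expected steady-state ceiling: ~{max_segments:.0f} segments "
      f"(~{max_segments * segment_mb:.0f} MB)")
```

If pg_xlog is many gigabytes with these settings, the fix is to make the sync storms stop (storage, kernel, or ext4 write-cache behavior), not to tune checkpoint parameters further.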