Thanks for your response. Please see below for answers to your questions.

On Mon, Nov 14, 2011 at 11:22 AM, Tomas Vondra <tv@xxxxxxxx> wrote:
> On 14 Listopad 2011, 19:16, Cody Caughlan wrote:
>> shared_buffers = 3584MB
>> wal_buffers = 16MB
>> checkpoint_segments = 32
>> max_wal_senders = 10
>> checkpoint_completion_target = 0.9
>> wal_keep_segments = 1024
>> maintenance_work_mem = 256MB
>> work_mem = 88MB
>> shared_buffers = 3584MB
>> effective_cache_size = 10GB
>
> Seems reasonable, although I'd bump up the checkpoint_timeout (the 5m is
> usually too low).

Ok, will do.

>> The PGDATA dir is a RAID10 on 4 local ("ephemeral" in EC2 speak)
>> drives. I ran some dd tests and received the following output:
>>
>> --- WRITING ---
>> root@sql03:/data# time sh -c "dd if=/dev/zero of=/data/tmp/bigfile
>> bs=8k count=4000000 && sync"
>> 4000000+0 records in
>> 4000000+0 records out
>> 32768000000 bytes (33 GB) copied, 670.663 s, 48.9 MB/s
>>
>> real    11m52.199s
>> user    0m2.720s
>> sys     0m45.330s
>
> This measures sequential write performance (and the same holds for the
> read test). We need to know the random I/O performance too - use bonnie++
> or a similar tool.
>
> Based on the AWS benchmarks I've seen so far, I'd expect about 90 MB/s for
> sequential reads/writes, and about twice that performance for a 4-drive
> RAID10. So while the reads (211 MB/s) seem perfectly OK, the writes
> (50 MB/s) are rather slow. Have you measured this on an idle system, or
> while the db was running?

I ran bonnie++ on a slave node that is doing active streaming replication
but is otherwise idle:
http://batch-files-test.s3.amazonaws.com/sql03.prod.html

bonnie++ on the master node:
http://batch-files-test.s3.amazonaws.com/sql01.prod.html

If I am reading this right (this is my first time using it), the numbers
don't look too good.
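As a quick sanity check on the dd figures above, the reported throughput can be reproduced from the raw byte count and elapsed time (assuming GNU dd's "MB/s" uses decimal megabytes, i.e. 10^6 bytes):

```python
# Sanity-check the dd write benchmark quoted above:
# bs=8k * count=4000000 blocks written in 670.663 s.
bytes_written = 8 * 1024 * 4_000_000   # = 32768000000 bytes, as dd reported
elapsed_s = 670.663

# GNU dd reports transfer rate in decimal MB/s (10^6 bytes per second).
throughput_mb_s = bytes_written / elapsed_s / 1e6
print(f"{throughput_mb_s:.1f} MB/s")   # matches dd's reported 48.9 MB/s
```

So the 48.9 MB/s figure is internally consistent; the question raised above is why it is roughly half of what a 4-drive ephemeral RAID10 should sustain.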
> See for example this:
>
> [1] http://victortrac.com/EC2_Ephemeral_Disks_vs_EBS_Volumes
> [2] http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws-ec2.html

>> I have enabled log_checkpoints and here is a recent sample from the log:
>>
>> 2011-11-14 17:39:15 UTC pid:3965 (-0) LOG:  checkpoint starting: time
>> 2011-11-14 17:43:49 UTC pid:3965 (-0) LOG:  checkpoint complete: wrote
>> 16462 buffers (3.6%); 0 transaction log file(s) added, 0 removed, 9
>> recycled; write=269.978 s, sync=4.106 s, total=274.117 s; sync
>> files=82, longest=2.943 s, average=0.050 s
>
> Nothing special here - this just says that the checkpoints were timed and
> finished on time (the default checkpoint timeout is 5 minutes; with
> completion target 0.9 the expected checkpoint time is about 270 s). Probably
> not a checkpoint issue.
>
>> I've been collecting random samples from pg_stat_bgwriter:
>> https://gist.github.com/4faec2ca9a79ede281e1
>
> It's a bit difficult to interpret this (please collect the data at regular
> intervals - e.g. every hour - and post the differences), but it seems
> reasonable.

Ok, I have a cron job running every hour to grab this data. I will post back
in a few hours or tomorrow.

>> So given all this information (if you need more, just let me know), is
>> there something fundamentally wrong or misconfigured? Do I have an
>> I/O issue?
>
> Probably - the discrepancy between read and write performance is a bit
> suspicious.
>
> Try to watch the I/O performance when this happens, i.e. run "iostat -x"
> and watch the output (especially %util, r_await, w_await), and post
> several lines of the output.

Here's a gist of running "iostat -x 3" for a few minutes:
https://gist.github.com/f94d98f2ef498a522ac2

Indeed, the %iowait and await values can spike up drastically.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance