On Mon, Nov 14, 2011 at 2:57 PM, Tomas Vondra <tv@xxxxxxxx> wrote:
> On 14 November 2011, 22:58, Cody Caughlan wrote:
>>> Seems reasonable, although I'd bump up the checkpoint_timeout (the 5m
>>> is usually too low).
>>
>> Ok, will do.
>
> Yes, but find out what that means and think about the possible impact
> first. It usually improves the checkpoint behaviour but increases the
> recovery time, and you may need more checkpoint segments. And I doubt
> this will fix the issue you've described.

Ok, I understand the ramifications of increasing the checkpoint timeout.
But I will investigate more before I jump in.

>> I ran bonnie++ on a slave node, doing active streaming replication but
>> otherwise idle:
>> http://batch-files-test.s3.amazonaws.com/sql03.prod.html
>>
>> bonnie++ on the master node:
>> http://batch-files-test.s3.amazonaws.com/sql01.prod.html
>>
>> If I am reading this right (this is my first time using it), the
>> numbers don't look too good.
>
> Are those instances equal, i.e. do they use the same RAID10 config etc.?
> It surprises me a bit that the slave performs much better than the
> master; for example, the sequential reads are much faster (210MB/s vs.
> 60MB/s) and it handles about twice the number of seeks (345 vs. 170).
> But this may be skewed because of the workload.

Yes, these two machines are the same. sql01 is the master node and is
quite busy. Running bonnie++ on it during its normal workload spiked I/O
for the duration. sql03 is a pure slave and is quite idle, save for
receiving WAL segments.

>> Here's a gist of running "iostat -x 3" for a few minutes:
>>
>> https://gist.github.com/f94d98f2ef498a522ac2
>>
>> Indeed, the %iowait and await values can spike up drastically.
>
> OK, so xvdb-xvde are individual drives and dm-0 is the RAID10 device,
> right? According to the log_checkpoints info, you're writing about
> 15000 buffers (120MB) in 270s, i.e. about 440kB/s. But according to the
> iostat you're writing up to 4MB/s, so it's not just about the
> checkpoints.
>
> What else is going on there? How much WAL do you write?

Yes, dm-0 is the RAID10 device. The WAL config is:

wal_buffers = 16MB
checkpoint_segments = 32
max_wal_senders = 10
checkpoint_completion_target = 0.9
checkpoint_timeout = 300
wal_keep_segments = 1024

>
> Do you have iotop installed? That might give you a hint what processes
> are writing data etc.

I do have iotop and have been watching it. The only I/O users are
postgres and its backends. I don't see anything else consuming any I/O.

By eyeballing iotop, the big consumers of disk writes are:

idle in transaction, SELECT, COMMIT

The first two are what I would think would be largely read operations
(certainly the SELECT), so it's not clear why a SELECT consumes write
time.

Here is the output of some pg_stat_bgwriter stats from the last couple
of hours:

https://gist.github.com/41ee26caca01471a9b77

One thing that I might not have made very clear earlier is that this DB,
especially a single table, receives a very large number of UPDATEs.
However, it seems to be well cached; I have shared_buffers = 3584MB and
a view of pg_buffercache shows:

https://gist.github.com/53c520571290cae14613

Is it possible that we're just trying to handle too many UPDATEs, and
they are all trying to hit disk all at once, causing this I/O contention?
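For what it's worth, the numbers in that bgwriter gist come from the
standard pg_stat_bgwriter view; something along these lines shows the
split between checkpoint writes, background-writer writes, and writes
done directly by backends (standard columns, nothing specific to our
setup):

    SELECT checkpoints_timed,
           checkpoints_req,
           buffers_checkpoint,   -- buffers written at checkpoint time
           buffers_clean,        -- buffers written by the background writer
           maxwritten_clean,
           buffers_backend,      -- buffers written directly by backends
           buffers_alloc
      FROM pg_stat_bgwriter;

If buffers_backend is climbing much faster than buffers_checkpoint plus
buffers_clean, the backends are evicting dirty buffers themselves to
make room, which would also help explain SELECTs showing write time in
iotop (that, plus hint-bit updates on recently modified rows).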
Here is a view of pg_stat_user_tables that shows the amount of live/dead tuples: https://gist.github.com/5ac1ae7d11facd72913f
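For reference, a query roughly like this (just ordinary
pg_stat_user_tables columns, not necessarily the exact one behind the
gist) produces that kind of listing:

    SELECT relname,
           n_live_tup,       -- estimated live tuples
           n_dead_tup,       -- estimated dead tuples awaiting vacuum
           n_tup_upd,        -- total updates
           n_tup_hot_upd     -- HOT updates, which avoid index writes
      FROM pg_stat_user_tables
     ORDER BY n_dead_tup DESC
     LIMIT 10;

On the heavily UPDATEd table, the n_tup_hot_upd vs. n_tup_upd ratio may
also be worth a look: a low HOT ratio means each update dirties index
pages too, multiplying the write volume.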