On 10/31/2016 08:19 PM, Andre Henry wrote:
My PG 9.4.5 server runs on Amazon RDS some times of the day we have a lot of checkpoints really close (less than 1 minute apart, see logs below) and we are trying to tune the DB to minimize the impact of the checkpoint or reduce the number of checkpoints. Server Stats · Instance Type db.r3.4xl • 16 vCPUs 122GB of RAM • PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit Some PG Stats • Shared Buffers = 31427608kB • Checkpoint Segments = 64 • Checkpoint completion target = .9 • Rest of the configuration is below Things we are doing • We have a huge table where each row is over 1kB and its very busy. We are splitting that into multiple tables especially the one json field that making it large. Questions • Each checkpoint log writes out the following checkpoint complete: wrote 166481 buffers (4.2%); 0 transaction log file(s) added, 0 removed, 64 recycled; write=32.441 s, sync=0.050 s, total=32.550 s; sync files=274, longest=0.049 s, average=0.000 s
OK, each checkpoint has to write all dirty data from checkpoints. You have ~170k buffers worth of dirty data, i.e. ~1.3GB.
• What does buffers mean? How do I find out how much RAM that is equivalent to?
Buffer holds 8kB of data, which is the "chunk" of data files.
• Based on my RDS stats I don't think IOPs will help, because I don't see any flat lines on my write operations / second graph. Is this a good assumption?
Not sure what you mean by this. Also, maybe you should talk to AWS if you're on RDS.
• What else can we tune to spread out checkpoints?
Based on the logs, your checkpoints are triggered by filling WAL. I see your checkpoints happen every 30 - 40 seconds, and you only have 64 segments.
So to get checkpoints checkpoints triggered by timeout (which I assume is 5 minutes, because you have not mentioned checkpoint_timeout), you need to increase checkpoint_segments enough to hold 5 minutes worth of WAL.
That means 300/30 * 64, i.e. roughly 640 segments (it's likely an overestimate, due to full page writes, but well).
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance