> The problem is that while this makes the checkpoints less frequent, it
> accumulates more changes that need to be written to disk during the
> checkpoint. Which means the impact is more severe.

True. But the checkpoints finish in approximately 5-10 minutes every time (even with checkpoint_completion_target of 0.9).
Here are the checkpoint logs for April 26: http://pastebin.com/Sh7bZ8u8

> The only case when this is not true is when repeatedly modifying a
> subset of the data (say, a few data blocks), because the changes merge
> into a single write during checkpoint.

In my case that is mostly what happens: every 5 minutes I update the same set of rows (appending new values to the data[] field), and that set stays the same for the entire clock hour. Of course, checkpoints don't line up exactly with the top of the hour, but I think my system could still take advantage of the behavior you describe.
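
To make that concrete, each 5-minute cycle issues updates along these lines (a sketch; the literal statement and values are assumptions, with the table and column names taken from the schema quoted below):

$ psql -d managed_target_stats -c "UPDATE stat_300_3_1
      SET data = data || '{1.5}'::real[]
      WHERE ts = 1398556800 AND target = 42 AND port = 1;"
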
> $ cat /proc/meminfo | grep Dirty

Here is a log of Dirty info every 1 second during checkpoint: http://pastebin.com/gmJFFAKW
The top of the log is from while the checkpoint alone was running. The checkpoint ran between 22:12:09 and 22:21:35; the write queries ran from around 22:17:30 to 22:24:21, so the two overlapped for the last few minutes of the checkpoint.
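
For reference, the per-second samples were collected with a loop along these lines (the exact script is an assumption):

$ while true; do date; grep Dirty /proc/meminfo; sleep 1; done >> dirty.log  # one timestamped sample per second
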
> Also, watch "iostat -x -k 1" or something similar to see disk activity.

Here is a log of iostat every 1 second during checkpoint: http://pastebin.com/RHyMkiQt
These stats cover the same timeframe as the Dirty log above, though pastebin's 500k size constraint forced me to delete a lot of the stats at the beginning. The top of the log is the checkpoint running alone, the write queries start in the middle, and the very end is the write queries alone (the checkpoint completed before the write queries did).
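
For future captures, iostat can timestamp each report itself, which makes it easier to line the samples up with the Dirty log (assuming a sysstat build that supports -t):

$ iostat -x -k -t 1 >> iostat.log
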
> So how much data in total are we talking about?
> OK, so there are multiple tables, and you're updating 50k rows in all
> tables in total?

Every 5 minutes: 50K rows are updated in 4 tables, and 2K rows are updated in 39 tables.
Every hour (at the top of the hour): 50K rows are updated in 8 tables, and 2K rows are updated in 78 tables.
If every update takes up space equivalent to one row, that's 278K rows updated across all tables every 5 minutes, and 556K (278K * 2) rows every hour. All tables follow the same schema, except that some tables don't have the 'port' field. The data[] column on each row can have at most 48 values.
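
Spelling out the arithmetic behind those totals:

  50K rows x 4 tables + 2K rows x 39 tables = 200K + 78K  = 278K rows / 5 min
  50K rows x 8 tables + 2K rows x 78 tables = 400K + 156K = 556K rows / hour
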
> Can you post \dt+ and \di+ so that we get an idea of table/index sizes?
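
For example, these can be captured non-interactively (the database name is taken from the psql prompt quoted below):

$ psql -d managed_target_stats -c '\dt+' > tables.txt
$ psql -d managed_target_stats -c '\di+' > indexes.txt
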
Thanks for your input on the ext3 filesystem and on having the WAL on a different disk. I'll see if these can be changed, but I cannot change them in the short term.

On Sun, Apr 27, 2014 at 3:46 PM, Tomas Vondra <tv@xxxxxxxx> wrote:
On 27.4.2014 23:01, Elanchezhiyan Elango wrote:
> *Problem description:*
> (I am resending this question after waiting for several hours because
> my previous mail got stalled probably because I didn't confirm my
> email address after subscribing. So resending the mail. Sorry if this
> is causing a double post.)
>
> After a few days of running in my test environment, a query timed out
> (query timeout=4mins). Also in general the queries were taking a lot
> longer than expected. The workload in my database is a write intensive
> workload. And the writes happen in a burst every 5 minutes. There are a
> whole bunch of insert and update queries that run every 5 minutes. When
> I analyzed the situation (by enabling more postgres logs), I noticed
> that postgres checkpoints were triggering approximately every 5 minutes
> and based on my online research I suspected the i/o overhead of
> checkpoints was affecting the query performance. The checkpoint related
> settings were:
> checkpoint_segments = 30
> checkpoint_timeout = 15min
>
> I modified these settings to the following:
> checkpoint_segments = 250
> checkpoint_timeout = 1h
> checkpoint_completion_target = 0.9

The problem is that while this makes the checkpoints less frequent, it
accumulates more changes that need to be written to disk during the
checkpoint. Which means the impact is more severe.

The only case when this is not true is when repeatedly modifying a
subset of the data (say, a few data blocks), because the changes merge
into a single write during checkpoint.
> After I tweaked these settings, checkpoints were happening only once in
> an hour and that improved the query performance. However, when the
> checkpoint happens every hour, the query performance is still very poor.
> This is still undesirable to my system.

So, can you share a few of the checkpoint log messages? So that we get
an idea of how much data needs to be synced to disk.
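
Assuming checkpoint logging isn't already enabled, a single postgresql.conf line turns it on, and a reload (no restart) picks it up:

log_checkpoints = on

$ pg_ctl -D $PGDATA reload
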
> I also tried editing dirty_background_ratio and dirty_expire_centisecs
> in /etc/sysctl.conf. All dirty related kernel settings:
>
> # sysctl -a | grep dirty
>
> vm.dirty_background_ratio = 1
> vm.dirty_background_bytes = 0
> vm.dirty_ratio = 20
> vm.dirty_bytes = 0
> vm.dirty_writeback_centisecs = 500
> vm.dirty_expire_centisecs = 500
>
> This also didn't improve the situation.

Can you monitor the amount of dirty data in page cache, i.e. data that
needs to be written to disk? Wait for the checkpoint and sample
/proc/meminfo a few times:

$ cat /proc/meminfo | grep Dirty
Also, watch "iostat -x -k 1" or something similar to see disk activity.
> My question is how to track down the reason for the poor performance
> during checkpoints and improve the query performance when the
> checkpoints happen?
>
> * *EXPLAIN ANALYZE:*
> o http://explain.depesz.com/s/BNva - An insert query inserting
> just 129 rows takes 20 seconds.
> o http://explain.depesz.com/s/5hA - An update query updating 43926
> rows takes 55 seconds.
> * *History:* It gets slower after a few days of the system running.
>
> *Table Metadata*:
>
> * The tables get updated every 5 minutes. At most 50000 rows in a table
> get updated every 5 minutes. About 50000 rows get inserted every 1 hour.
> * There are 90 tables in the DB. 43 of these are updated every 5
> minutes. 8/90 tables receive a high update traffic of 50000
> updates/5mins. Remaining tables receive an update traffic of 2000
> updates/5min. 43/90 tables are updated every 1 hour.

So how much data in total are we talking about?

> *PostgreSQL version: *PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu,
> compiled by gcc (GCC) 4.6.x-google 20120601 (prerelease), 64-bit
> *How you installed PostgreSQL: *Compiled from source and installed.
>
>
> *Changes made to the settings in the postgresql.conf file:*
> name | current_setting | source
> ------------------------------+------------------------+----------------------
> maintenance_work_mem | 64MB | configuration file
> temp_buffers | 256MB | configuration file
> wal_buffers | 1MB | configuration file
> work_mem | 128MB | configuration file

Seems fine to me, except for the following changes:

Any particular reason for setting work_mem > maintenance_work_mem? Why
have you modified wal_buffers and temp_buffers?

I doubt these are related to the issues you're seeing, though.
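
For what it's worth, the settings table above looks like the output of a pg_settings query along these lines (an assumption about how it was generated):

$ psql -c "SELECT name, current_setting(name) AS current_setting, source
           FROM pg_settings WHERE source = 'configuration file';"
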
> * *Relevant Schema*: All tables referenced in this question have this
> same schema
>
> managed_target_stats=> \d stat_300_3_1
>
> Table "public.stat_300_40110_1"
>
> Column | Type | Modifiers
> --------+---------+-----------
> ts | integer |
> target | bigint |
> port | integer |
> data | real[] |
>
> Indexes:
> "unique_stat_300_40110_1" UNIQUE CONSTRAINT, btree (ts, target, port)
> "idx_port_stat_300_40110_1" btree (port)
> "idx_target_stat_300_40110_1" btree (target)
> "idx_ts_stat_300_40110_1" btree (ts)

OK, so there are multiple tables, and you're updating 50k rows in all
tables in total? Can you post \dt+ and \di+ so that we get an idea of
table/index sizes?

> * *Hardware*:
> o CPU: Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
> o Memory: 6GB
> o Storage Details:
> There are 2 500GB disks (/dev/sda, /dev/sdb) with the following 6
> partitions on each disk.
>
> *Number Start End Size Type File system Flags*
> 1 512B 24.7MB 24.7MB primary boot
> 2 24.7MB 6473MB 6449MB primary linux-swap(v1)
> 3 6473MB 40.8GB 34.4GB primary ext3
> 4 40.8GB 500GB 459GB extended lba
> 5 40.8GB 408GB 367GB logical ext3
> 6 408GB 472GB 64.4GB logical ext3

The first problem here is ext3. Its behavior when performing fsync is
really terrible. See
http://blog.2ndquadrant.com/linux_filesystems_and_postgres/
for more details. So the first thing you should do is switch to ext4
or xfs.
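
To double-check which filesystem the data directory actually sits on (the $PGDATA path is an assumption):

$ df -T $PGDATA
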
> *Disk model and details:*
> Model Family: Western Digital RE3 Serial ATA family

Regular 7.2k SATA disk, not the most powerful piece of hardware.

> The postgres data is stored on a software RAID10 on partition 5 of
> both these disks.

So essentially RAID0, as you only have 2 drives.

> * *Maintenance Setup*: autovacuum is running with default settings.
> Old records are deleted every night. I also do 'vacuum full' on 12
> tables that receive a large number of updates every night at 1AM. I
> have noticed that these 'vacuum full' also time out. (I am planning
> to post a separate question regarding my vacuuming strategy).

My bet is it's related. If the system is I/O bound, it's natural that
the vacuum full runs are performing badly too.

> * *WAL Configuration*: The WAL is in the same disk.
Which is not helping, because it interferes with the other I/O.
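
The usual way to fix that without a re-initdb is to move the WAL directory (pg_xlog on 9.1) onto the other spindles and symlink it back; a sketch, with the mount point as an assumption:

$ pg_ctl -D $PGDATA stop                      # WAL can only be moved with the server down
$ mv $PGDATA/pg_xlog /mnt/waldisk/pg_xlog     # relocate the WAL onto the separate disk
$ ln -s /mnt/waldisk/pg_xlog $PGDATA/pg_xlog  # point the old path at the new location
$ pg_ctl -D $PGDATA start
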
regards
Tomas