Replaying 48 WAL files takes 80 minutes

"Albe Laurenz" <laurenz.albe@xxxxxxxxxx> · Mon, 29 Oct 2012 14:05:24 +0100

I am configuring streaming replication with hot standby
with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
PostgreSQL was compiled from source.

It works fine, except that starting the standby took for ever:
it took the system more than 80 minutes to replay 48 WAL files
and connect to the primary.

Can anybody think of an explanation why it takes that long?

This is decent hardware: 24 cores of AMD Opteron 6174, 128 GB RAM,
NetApp SAN attached with 8 GBit Fibrechannel (ext4 file system).
An identical system performed fine in performance tests.

Here is the log; I have edited it for readability:

2012-10-29 09:22:22.945  database system was interrupted; last known up
at 2012-10-26 01:11:59 CEST
2012-10-29 09:22:22.945  creating missing WAL directory
"pg_xlog/archive_status"
2012-10-29 09:22:22.947  entering standby mode
2012-10-29 09:22:23.434  restored log file "00000001000001D1000000C4"
from archive
2012-10-29 09:22:23.453  redo starts at 1D1/C4000020
2012-10-29 09:22:25.847  restored log file "00000001000001D1000000C5"
from archive
2012-10-29 09:22:27.457  restored log file "00000001000001D1000000C6"
from archive
2012-10-29 09:22:28.946  restored log file "00000001000001D1000000C7"
from archive
2012-10-29 09:22:30.421  restored log file "00000001000001D1000000C8"
from archive
2012-10-29 09:22:31.243  restored log file "00000001000001D1000000C9"
from archive
2012-10-29 09:22:32.194  restored log file "00000001000001D1000000CA"
from archive
2012-10-29 09:22:33.169  restored log file "00000001000001D1000000CB"
from archive
2012-10-29 09:22:33.565  restored log file "00000001000001D1000000CC"
from archive
2012-10-29 09:23:35.451  restored log file "00000001000001D1000000CD"
from archive

Everything is nice until here.
Replaying this WAL file suddenly takes 1.5 minutes instead
of mere seconds as before.

2012-10-29 09:24:54.761  restored log file "00000001000001D1000000CE"
from archive
2012-10-29 09:27:23.013  restartpoint starting: time
2012-10-29 09:28:12.200  restartpoint complete: wrote 242 buffers
(0.0%);
                         0 transaction log file(s) added, 0 removed, 0
recycled;
                         write=48.987 s, sync=0.185 s, total=49.184 s;
                         sync files=1096, longest=0.016 s, average=0.000
s
2012-10-29 09:28:12.206  recovery restart point at 1D1/CC618278
2012-10-29 09:28:31.226  restored log file "00000001000001D1000000CF"
from archive

Again there is a difference of 2.5 minutes
between these WAL files, only 50 seconds of
which were spent in the restartpoint.

>From here on it continues in quite the same vein.
Some WAL files are restored in seconds, but some take
more than 4 minutes.

I'll skip to the end of the log:

2012-10-29 10:37:53.809  restored log file "00000001000001D1000000EF"
from archive
2012-10-29 10:38:53.194  restartpoint starting: time
2012-10-29 10:39:25.929  restartpoint complete: wrote 161 buffers
(0.0%);
                         0 transaction log file(s) added, 0 removed, 0
recycled;
                         write=32.661 s, sync=0.066 s, total=32.734 s;
                         sync files=251, longest=0.003 s, average=0.000
s
2012-10-29 10:39:25.929  recovery restart point at 1D1/ED95C728
2012-10-29 10:42:56.153  restored log file "00000001000001D1000000F0"
from archive
2012-10-29 10:43:53.062  restartpoint starting: time
2012-10-29 10:45:36.871  restored log file "00000001000001D1000000F1"
from archive
2012-10-29 10:45:39.832  restartpoint complete: wrote 594 buffers
(0.0%);
                         0 transaction log file(s) added, 0 removed, 0
recycled;
                         write=106.666 s, sync=0.093 s, total=106.769 s;
                         sync files=729, longest=0.004 s, average=0.000
s
2012-10-29 10:45:39.832  recovery restart point at 1D1/EF5D4340
2012-10-29 10:46:13.602  restored log file "00000001000001D1000000F2"
from archive
2012-10-29 10:47:38.396  restored log file "00000001000001D1000000F3"
from archive
2012-10-29 10:47:38.962  streaming replication successfully connected to
primary

I'd be happy if somebody could shed light on this.

Yours,
Laurenz Albe

PS: Here is the configuration:

             name             | current_setting 
------------------------------+---------------------------
 version                      | PostgreSQL 9.1.3 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat
4.4.6-3), 64-bit
 archive_command              | gzip -1 <"%p" | tee
/POSTGRES/data/exchange/"%f".gz >/POSTGRES/data/backups/ELAK/"%f".gz
 archive_mode                 | on
 checkpoint_completion_target | 0.9
 checkpoint_segments          | 30
 client_encoding              | UTF8
 constraint_exclusion         | off
 cursor_tuple_fraction        | 1
 custom_variable_classes      | pg_stat_statements
 default_statistics_target    | 1000
 effective_cache_size         | 64GB
 hot_standby                  | on
 lc_collate                   | de_DE.UTF8
 lc_ctype                     | de_DE.UTF8
 listen_addresses             | *
 log_checkpoints              | on
 log_connections              | on
 log_destination              | csvlog
 log_directory                | /POSTGRES/data/logs/ELAK
 log_disconnections           | on
 log_filename                 | ELAK-%Y-%m-%d.log
 log_lock_waits               | on
 log_min_duration_statement   | 3s
 log_min_error_statement      | log
 log_min_messages             | log
 log_rotation_size            | 0
 log_statement                | ddl
 log_temp_files               | 0
 logging_collector            | on
 maintenance_work_mem         | 1GB
 max_connections              | 800
 max_prepared_transactions    | 800
 max_stack_depth              | 9MB
 max_standby_archive_delay    | 0
 max_standby_streaming_delay  | 0
 max_wal_senders              | 2
 pg_stat_statements.max       | 5000
 pg_stat_statements.track     | all
 port                         | 55503
 server_encoding              | UTF8
 shared_buffers               | 16GB
 shared_preload_libraries     | pg_stat_statements,passwordcheck
 ssl                          | on
 tcp_keepalives_count         | 0
 tcp_keepalives_idle          | 0
 TimeZone                     | Europe/Vienna
 wal_buffers                  | 16MB
 wal_level                    | hot_standby
 work_mem                     | 8MB
(49 rows)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance