I am configuring streaming replication with hot standby with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64). PostgreSQL was compiled from source. It works fine, except that starting the standby took for ever: it took the system more than 80 minutes to replay 48 WAL files and connect to the primary. Can anybody think of an explanation why it takes that long? This is decent hardware: 24 cores of AMD Opteron 6174, 128 GB RAM, NetApp SAN attached with 8 GBit Fibrechannel (ext4 file system). An identical system performed fine in performance tests. Here is the log; I have edited it for readability: 2012-10-29 09:22:22.945 database system was interrupted; last known up at 2012-10-26 01:11:59 CEST 2012-10-29 09:22:22.945 creating missing WAL directory "pg_xlog/archive_status" 2012-10-29 09:22:22.947 entering standby mode 2012-10-29 09:22:23.434 restored log file "00000001000001D1000000C4" from archive 2012-10-29 09:22:23.453 redo starts at 1D1/C4000020 2012-10-29 09:22:25.847 restored log file "00000001000001D1000000C5" from archive 2012-10-29 09:22:27.457 restored log file "00000001000001D1000000C6" from archive 2012-10-29 09:22:28.946 restored log file "00000001000001D1000000C7" from archive 2012-10-29 09:22:30.421 restored log file "00000001000001D1000000C8" from archive 2012-10-29 09:22:31.243 restored log file "00000001000001D1000000C9" from archive 2012-10-29 09:22:32.194 restored log file "00000001000001D1000000CA" from archive 2012-10-29 09:22:33.169 restored log file "00000001000001D1000000CB" from archive 2012-10-29 09:22:33.565 restored log file "00000001000001D1000000CC" from archive 2012-10-29 09:23:35.451 restored log file "00000001000001D1000000CD" from archive Everything is nice until here. Replaying this WAL file suddenly takes 1.5 minutes instead of mere seconds as before. 2012-10-29 09:24:54.761 restored log file "00000001000001D1000000CE" from archive 2012-10-29 09:27:23.013 restartpoint starting: time 2012-10-29 09:28:12.200 restartpoint complete: wrote 242 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=48.987 s, sync=0.185 s, total=49.184 s; sync files=1096, longest=0.016 s, average=0.000 s 2012-10-29 09:28:12.206 recovery restart point at 1D1/CC618278 2012-10-29 09:28:31.226 restored log file "00000001000001D1000000CF" from archive Again there is a difference of 2.5 minutes between these WAL files, only 50 seconds of which were spent in the restartpoint. >From here on it continues in quite the same vein. Some WAL files are restored in seconds, but some take more than 4 minutes. I'll skip to the end of the log: 2012-10-29 10:37:53.809 restored log file "00000001000001D1000000EF" from archive 2012-10-29 10:38:53.194 restartpoint starting: time 2012-10-29 10:39:25.929 restartpoint complete: wrote 161 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=32.661 s, sync=0.066 s, total=32.734 s; sync files=251, longest=0.003 s, average=0.000 s 2012-10-29 10:39:25.929 recovery restart point at 1D1/ED95C728 2012-10-29 10:42:56.153 restored log file "00000001000001D1000000F0" from archive 2012-10-29 10:43:53.062 restartpoint starting: time 2012-10-29 10:45:36.871 restored log file "00000001000001D1000000F1" from archive 2012-10-29 10:45:39.832 restartpoint complete: wrote 594 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=106.666 s, sync=0.093 s, total=106.769 s; sync files=729, longest=0.004 s, average=0.000 s 2012-10-29 10:45:39.832 recovery restart point at 1D1/EF5D4340 2012-10-29 10:46:13.602 restored log file "00000001000001D1000000F2" from archive 2012-10-29 10:47:38.396 restored log file "00000001000001D1000000F3" from archive 2012-10-29 10:47:38.962 streaming replication successfully connected to primary I'd be happy if somebody could shed light on this. Yours, Laurenz Albe PS: Here is the configuration: name | current_setting ------------------------------+--------------------------- version | PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit archive_command | gzip -1 <"%p" | tee /POSTGRES/data/exchange/"%f".gz >/POSTGRES/data/backups/ELAK/"%f".gz archive_mode | on checkpoint_completion_target | 0.9 checkpoint_segments | 30 client_encoding | UTF8 constraint_exclusion | off cursor_tuple_fraction | 1 custom_variable_classes | pg_stat_statements default_statistics_target | 1000 effective_cache_size | 64GB hot_standby | on lc_collate | de_DE.UTF8 lc_ctype | de_DE.UTF8 listen_addresses | * log_checkpoints | on log_connections | on log_destination | csvlog log_directory | /POSTGRES/data/logs/ELAK log_disconnections | on log_filename | ELAK-%Y-%m-%d.log log_lock_waits | on log_min_duration_statement | 3s log_min_error_statement | log log_min_messages | log log_rotation_size | 0 log_statement | ddl log_temp_files | 0 logging_collector | on maintenance_work_mem | 1GB max_connections | 800 max_prepared_transactions | 800 max_stack_depth | 9MB max_standby_archive_delay | 0 max_standby_streaming_delay | 0 max_wal_senders | 2 pg_stat_statements.max | 5000 pg_stat_statements.track | all port | 55503 server_encoding | UTF8 shared_buffers | 16GB shared_preload_libraries | pg_stat_statements,passwordcheck ssl | on tcp_keepalives_count | 0 tcp_keepalives_idle | 0 TimeZone | Europe/Vienna wal_buffers | 16MB wal_level | hot_standby work_mem | 8MB (49 rows) -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance