On Mon, Oct 29, 2012 at 02:05:24PM +0100, Albe Laurenz wrote:
> I am configuring streaming replication with hot standby
> with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
> PostgreSQL was compiled from source.
>
> It works fine, except that starting the standby took forever:
> it took the system more than 80 minutes to replay 48 WAL files
> and connect to the primary.
>
> Can anybody think of an explanation why it takes that long?
>
> This is decent hardware: 24 cores of AMD Opteron 6174, 128 GB RAM,
> and a NetApp SAN attached via 8 GBit Fibre Channel (ext4 file system).
> An identical system performed fine in performance tests.
>
> Here is the log; I have edited it for readability:
>
> 2012-10-29 09:22:22.945  database system was interrupted; last known up at 2012-10-26 01:11:59 CEST
> 2012-10-29 09:22:22.945  creating missing WAL directory "pg_xlog/archive_status"
> 2012-10-29 09:22:22.947  entering standby mode
> 2012-10-29 09:22:23.434  restored log file "00000001000001D1000000C4" from archive
> 2012-10-29 09:22:23.453  redo starts at 1D1/C4000020
> 2012-10-29 09:22:25.847  restored log file "00000001000001D1000000C5" from archive
> 2012-10-29 09:22:27.457  restored log file "00000001000001D1000000C6" from archive
> 2012-10-29 09:22:28.946  restored log file "00000001000001D1000000C7" from archive
> 2012-10-29 09:22:30.421  restored log file "00000001000001D1000000C8" from archive
> 2012-10-29 09:22:31.243  restored log file "00000001000001D1000000C9" from archive
> 2012-10-29 09:22:32.194  restored log file "00000001000001D1000000CA" from archive
> 2012-10-29 09:22:33.169  restored log file "00000001000001D1000000CB" from archive
> 2012-10-29 09:22:33.565  restored log file "00000001000001D1000000CC" from archive
> 2012-10-29 09:23:35.451  restored log file "00000001000001D1000000CD" from archive
>
> Everything is nice until here.
> Replaying this WAL file suddenly takes 1.5 minutes instead
> of mere seconds as before.
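The jump is easy to verify from the timestamps alone; a quick sketch (Python, using only the last four "restored log file" timestamps quoted above):

```python
from datetime import datetime

# Timestamps of successive "restored log file" lines from the log above
restored = [
    "2012-10-29 09:22:32.194",  # ...CA
    "2012-10-29 09:22:33.169",  # ...CB
    "2012-10-29 09:22:33.565",  # ...CC
    "2012-10-29 09:22:33.565",  # placeholder removed below
]
restored[-1] = "2012-10-29 09:23:35.451"  # ...CD

fmt = "%Y-%m-%d %H:%M:%S.%f"
times = [datetime.strptime(t, fmt) for t in restored]

# Print the gap between each pair of consecutive restores
for prev, cur in zip(times, times[1:]):
    print(f"{(cur - prev).total_seconds():.3f} s")
# Sub-second gaps, then 61.886 s for the last file
```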
>
> 2012-10-29 09:24:54.761  restored log file "00000001000001D1000000CE" from archive
> 2012-10-29 09:27:23.013  restartpoint starting: time
> 2012-10-29 09:28:12.200  restartpoint complete: wrote 242 buffers (0.0%);
>                          0 transaction log file(s) added, 0 removed, 0 recycled;
>                          write=48.987 s, sync=0.185 s, total=49.184 s;
>                          sync files=1096, longest=0.016 s, average=0.000 s
> 2012-10-29 09:28:12.206  recovery restart point at 1D1/CC618278
> 2012-10-29 09:28:31.226  restored log file "00000001000001D1000000CF" from archive
>
> Again there is a difference of 2.5 minutes between these WAL files,
> only 50 seconds of which were spent in the restartpoint.
>
> From here on it continues in quite the same vein.
> Some WAL files are restored in seconds, but some take
> more than 4 minutes.
>
> I'll skip to the end of the log:
>
> 2012-10-29 10:37:53.809  restored log file "00000001000001D1000000EF" from archive
> 2012-10-29 10:38:53.194  restartpoint starting: time
> 2012-10-29 10:39:25.929  restartpoint complete: wrote 161 buffers (0.0%);
>                          0 transaction log file(s) added, 0 removed, 0 recycled;
>                          write=32.661 s, sync=0.066 s, total=32.734 s;
>                          sync files=251, longest=0.003 s, average=0.000 s
> 2012-10-29 10:39:25.929  recovery restart point at 1D1/ED95C728
> 2012-10-29 10:42:56.153  restored log file "00000001000001D1000000F0" from archive
> 2012-10-29 10:43:53.062  restartpoint starting: time
> 2012-10-29 10:45:36.871  restored log file "00000001000001D1000000F1" from archive
> 2012-10-29 10:45:39.832  restartpoint complete: wrote 594 buffers (0.0%);
>                          0 transaction log file(s) added, 0 removed, 0 recycled;
>                          write=106.666 s, sync=0.093 s, total=106.769 s;
>                          sync files=729, longest=0.004 s, average=0.000 s
> 2012-10-29 10:45:39.832  recovery restart point at 1D1/EF5D4340
> 2012-10-29 10:46:13.602  restored log file "00000001000001D1000000F2" from archive
> 2012-10-29 10:47:38.396  restored log file "00000001000001D1000000F3" from archive
> 2012-10-29 10:47:38.962  streaming replication successfully connected to primary
>
> I'd be happy if somebody could shed light on this.
>
> Yours,
> Laurenz Albe
>
> PS: Here is the configuration:
>
>              name             |              current_setting
> ------------------------------+---------------------------------------------
> version                      | PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit
> archive_command              | gzip -1 <"%p" | tee /POSTGRES/data/exchange/"%f".gz >/POSTGRES/data/backups/ELAK/"%f".gz
> archive_mode                 | on
> checkpoint_completion_target | 0.9
> checkpoint_segments          | 30
> client_encoding              | UTF8
> constraint_exclusion         | off
> cursor_tuple_fraction        | 1
> custom_variable_classes      | pg_stat_statements
> default_statistics_target    | 1000
> effective_cache_size         | 64GB
> hot_standby                  | on
> lc_collate                   | de_DE.UTF8
> lc_ctype                     | de_DE.UTF8
> listen_addresses             | *
> log_checkpoints              | on
> log_connections              | on
> log_destination              | csvlog
> log_directory                | /POSTGRES/data/logs/ELAK
> log_disconnections           | on
> log_filename                 | ELAK-%Y-%m-%d.log
> log_lock_waits               | on
> log_min_duration_statement   | 3s
> log_min_error_statement      | log
> log_min_messages             | log
> log_rotation_size            | 0
> log_statement                | ddl
> log_temp_files               | 0
> logging_collector            | on
> maintenance_work_mem         | 1GB
> max_connections              | 800
> max_prepared_transactions    | 800
> max_stack_depth              | 9MB
> max_standby_archive_delay    | 0
> max_standby_streaming_delay  | 0
> max_wal_senders              | 2
> pg_stat_statements.max       | 5000
> pg_stat_statements.track     | all
> port                         | 55503
> server_encoding              | UTF8
> shared_buffers               | 16GB
> shared_preload_libraries     | pg_stat_statements,passwordcheck
> ssl                          | on
> tcp_keepalives_count         | 0
> tcp_keepalives_idle          | 0
> TimeZone                     | Europe/Vienna
> wal_buffers                  | 16MB
> wal_level                    | hot_standby
> work_mem                     | 8MB
> (49 rows)

Hi Albe,

My first guess would be that there was something using I/O resources on your NetApp.
That is the behavior you would expect once the I/O cache on the NetApp
has been filled and you actually have to perform writes to the
underlying disks. Is this a dedicated box? Can you get I/O stats from
the box during the recovery?

Regards,
Ken

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
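For the I/O stats Ken asks about, `iostat -x` from the sysstat package is the usual tool; if it is not installed, similar numbers can be pulled straight from /proc/diskstats. A minimal sketch (Linux only; the one-second sampling window and the %util approximation are illustrative, not a substitute for proper monitoring):

```python
import time

def read_diskstats():
    """Parse /proc/diskstats into {device: (sectors_read, sectors_written, ms_doing_io)}."""
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            # Field layout (Documentation/iostats): 2=name, 5=sectors read,
            # 9=sectors written, 12=milliseconds spent doing I/O
            stats[fields[2]] = (int(fields[5]), int(fields[9]), int(fields[12]))
    return stats

before = read_diskstats()
time.sleep(1)
after = read_diskstats()

for dev, (r1, w1, io1) in after.items():
    r0, w0, io0 = before.get(dev, (0, 0, 0))
    # Sectors are 512 bytes; the io-ms delta over a 1 s window approximates %util
    print(f"{dev}: read {(r1 - r0) * 512 / 1024:.0f} KiB/s, "
          f"write {(w1 - w0) * 512 / 1024:.0f} KiB/s, "
          f"util {(io1 - io0) / 10:.0f}%")
```

Sampling like this during recovery (on both the standby and, if possible, on the filer itself) would show whether the slow WAL restores line up with saturated devices.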