
Re: pg_rewind copy so much data


 



Hi,

Thanks for your response. I have just replayed switching master and slave once again:

- one master and one slave (the total size of each server is more than 4GB). Currently the last log line of the slave is "started streaming WAL from primary at 2/D6000000 on timeline 10".

- stop master; the slave shows the logs below:
          replication terminated by primary server
          End of WAL reached on timeline 10 at 2/D69304D0
          Invalid record length at 2/D69304D0
          could not connect to primary server

- promote the slave:
          receive promote request
          redo done at 2/D6930460
          selected new timeline ID: 11
          archive recovery complete
          MultiXact member wraparound protections are now enabled
          database system is ready to accept connections
          autovacuum launcher started

- start and stop the old master, then run pg_rewind (all executed immediately after promoting the slave). Logs of pg_rewind:
          servers diverged at WAL position 2/D69304D0 on timeline 10
          rewinding from last common checkpoint at 2/D6930460 on timeline 10
          reading source file list
          reading target file list
          reading WAL in target
          need to copy 4168 MB (total source directory is 4186 MB)
          4268372/4268372 kB (100%) copied
          creating backup label and updating control file
          syncing target data directory
          Done!
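
To put a number on the summary line above, here is a minimal Python sketch that parses it and reports what fraction of the source directory pg_rewind planned to copy. The function name copy_fraction is only illustrative, and the regular expression assumes the exact wording of the line as it appears in this log.

```python
import re

def copy_fraction(line: str) -> float:
    """Return (MB to copy) / (total source MB) from a pg_rewind summary line."""
    m = re.search(
        r"need to copy (\d+) MB \(total source directory is (\d+) MB\)", line
    )
    if m is None:
        raise ValueError("not a pg_rewind size summary line")
    to_copy, total = int(m.group(1)), int(m.group(2))
    return to_copy / total

# Line taken verbatim from the pg_rewind output above.
line = "need to copy 4168 MB (total source directory is 4186 MB)"
print(f"{copy_fraction(line):.1%}")  # 99.6%
```

So nearly the whole source directory was scheduled for copying, not just the blocks changed after the divergence point.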

If I run pg_rewind with the debug option, it just shows an additional list of files copied in directories like base or pg_tblspc. As far as I can tell, no data has been inserted or modified since the first step. The only difference between the two servers is caused by restarting the old master.
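
For reference, the distance between the two WAL positions reported by pg_rewind can be computed directly. This is a small sketch, assuming the usual LSN notation of two hexadecimal numbers (high 32 bits, then low 32 bits) separated by a slash; lsn_to_int is a helper name made up for this example.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '2/D69304D0' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

# Values taken from the pg_rewind log above.
diverged = lsn_to_int("2/D69304D0")         # "servers diverged at WAL position"
last_checkpoint = lsn_to_int("2/D6930460")  # "last common checkpoint at"

print(diverged - last_checkpoint)  # 112
```

The divergence point is only 112 bytes past the last common checkpoint. This says nothing about how much WAL the old master wrote after it was restarted, so it does not explain the copy size by itself.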

Thanks and Regards,

Hung Phan



On Wed, Sep 13, 2017 at 10:48 AM, Michael Paquier <michael.paquier@xxxxxxxxx> wrote:
On Wed, Sep 13, 2017 at 12:41 PM, Hung Phan <hungphan227@xxxxxxxxx> wrote:
> I have tested pg_rewind (ver 9.5) with the following scenario:
>
> - one master and one slave (total size of each server is more than 4GB)
> - set wal_log_hints=on and restart both
> - stop master, promote slave
> - start old master again (now two servers have diverged)
> - stop old master, run pg_rewind with progress option

That's a good flow. Don't forget to run a manual checkpoint after
promotion to update the control file of the promoted standby, so that
pg_rewind is able to identify the timeline difference between the
source and the target servers.
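
A rough sketch of that sequence, written as a Python helper that only builds the command lines and does not run them. The data directory path and the connection string are placeholders, and rewind_commands is a name invented for this example; the flags are the standard psql, pg_ctl and pg_rewind options.

```python
import shlex

def rewind_commands(old_master_pgdata: str, new_master_conninfo: str) -> list:
    """Build the commands to run after the standby has been promoted."""
    return [
        # Manual checkpoint on the promoted standby, so its control file
        # records the new timeline before pg_rewind looks at it.
        ["psql", "-d", new_master_conninfo, "-c", "CHECKPOINT;"],
        # The target (old master) must be cleanly shut down before rewinding.
        ["pg_ctl", "-D", old_master_pgdata, "stop", "-m", "fast"],
        # Rewind the old master against the new master, showing progress.
        ["pg_rewind",
         "--target-pgdata", old_master_pgdata,
         "--source-server", new_master_conninfo,
         "--progress"],
    ]

# Placeholder values, not taken from the thread.
for cmd in rewind_commands("/path/to/old_master/data",
                           "host=new-master port=5432 dbname=postgres"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running each one by hand or through subprocess.run.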

> pg_rewind ran successfully, but I saw that it copied more than 4GB (4265891 kB
> copied). There was only a very minor difference between the two servers, so
> why did pg_rewind copy almost all the data of the new master?

Without knowing exactly the list of things that have been registered
as things to copy from the active source to the target, it is hard to
give a conclusion. But my bet here is that you let the target server
online long enough that it had a bunch of blocks updated, causing more
relation blocks to be copied from the source because more effort
would be needed to re-sync it. That's only an assumption without data
with clear numbers, numbers that could be found using the --debug
messages of pg_rewind.
--
Michael

