
Re: Fast logical replication jump start with PG 10

On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
Hi Adrian!

On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:

    On 05/25/2018 06:35 PM, Olivier Gautherot wrote:

        Hi Adrian, thanks for your reply. Here is the clarification.

        1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
        test machine, it takes between 15 and 20 minutes for just over
        100GB. I can negotiate this time with our customer. The vacuum
        process took another 5 to 7 minutes. This is what I was
        referring to with the 30 minutes (point 3 in your questions).

        2) After pg_upgrade, I published the tables on the database (in
        the sense "CREATE DATABASE") and subscribed to this publication
        on the second server (logical replication). The data copy
        process started immediately and took around 1 hour. I then
        loaded the indexes, which took another 2h20m. At that point the
        active-passive cluster was ready to go.


    The index creation was done on the replicated machine I presume,
    using what command?


The sequence on the replicated machine was (pseudo-code to simplify the syntax):

- pg_dump --section=pre-data -h master_machine master_database | psql -h replication_machine replication_database
# This took seconds, "pre-data" discards the indexes

- psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION '...' PUBLICATION mypub;" replication_database
# This took about 1 hour for the initial sync

- pg_dump --section=post-data -h master_machine master_database | psql -h replication_machine replication_database
# This took 2h20m to load the various indexes
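
For completeness, the publication side is not shown above; it was created on the master beforehand, roughly along these lines (FOR ALL TABLES is an assumption, a list of specific tables works just as well):

-- run once on master_machine, in master_database
CREATE PUBLICATION mypub FOR ALL TABLES;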

This sequence follows the recommendation of section 14.4.3 in https://www.postgresql.org/docs/10/static/populate.html . If I stick to streaming as we do today (i.e. pg_upgrade and then rsync to the replication server), I can be ready in about 1 hour, which is more acceptable for the customer.
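
The same page also suggests increasing maintenance_work_mem while indexes are created; a rough sketch of applying it only to the restore session (the 2GB value is just an example, not a recommendation):

- pg_dump --section=post-data -h master_machine master_database | PGOPTIONS="-c maintenance_work_mem=2GB" psql -h replication_machine replication_database
# PGOPTIONS passes the setting to that psql session only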


Just realized that by setting up the streaming as above you are already doing basically the same thing as I suggested in my previous post. Streaming and logical replication can exist at the same time:

https://www.postgresql.org/docs/10/static/logical-replication.html

"Logical replication is a method of replicating data objects and their changes, based upon their replication identity (usually a primary key). We use the term logical in contrast to physical replication, which uses exact block addresses and byte-by-byte replication. PostgreSQL supports both mechanisms concurrently, see Chapter 26. Logical replication allows fine-grained control over both data replication and security."

So you could set up the logical replication after the streaming is done, using the copy_data = false option, and be done in a relatively short period of time. At that point you could decide whether to keep the streaming running or not.
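
A minimal sketch of that subscription, reusing the placeholder names from your pseudo-code and assuming the target database is writable at that point:

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=master_machine dbname=master_database'
    PUBLICATION mypub
    WITH (copy_data = false);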


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx



