On Sep 14, 2021, at 1:42 AM, Tomas Pospisek <tpo2@xxxxxxxxxxxxx> wrote:
I used https://hevodata.com/learn/postgresql-logical-replication/, but I rather suspect there are better options, as I had a number of issues with that one. Essentially, it seems to boil down to the following steps (rough command sketches at the end of this message):

1) Set up a new "slave" database. I used a pg_dump -s command piped through to psql to transfer the schema only to the new DB.

2) Configure both master and slave postgresql servers with wal_level=logical in the postgresql.conf file. This requires a server restart, which stinks if you need absolutely 0 downtime, but ideally it should only take a few seconds.

3) On the master server, run the SQL command

    CREATE PUBLICATION publication_name FOR ALL TABLES;

Presumably this should be issued while connected to the database you wish to replicate, and you can specify specific tables only if desired (though I haven't looked into how). There is a note in that how-to about published tables needing a REPLICA IDENTITY to replicate DELETE and UPDATE operations; I still need to look into that more. For my case, however, I have an insert-only workflow, so it wasn't an issue.

4) Finally, on the slave server, run the command

    CREATE SUBSCRIPTION subscription_name CONNECTION '...<connection string>...' PUBLICATION publication_name;

where the connection string is something like:

    'host=my.remote.host port=5432 dbname=my_big_db'

...And that's it: replication should kick off by syncing up the existing data from master to slave, and after that changes should be sent in real time.

One gotcha that I ran into: my table used the postgis extension, which was set up on the slave DB when I ran the pg_dump command. Creating that extension creates *and populates* a table named "spatial_ref_sys". Since I had created the publication as "FOR ALL TABLES", the replication tried to include this table, which failed due to UNIQUE constraint violations. At that point replication stopped, AS DID ALL INSERTS INTO THE MASTER DB! I deleted all records from said table on the slave DB, replication kicked in and re-populated it, and life went on, but I lost something like 14 hours of data thanks to this mistake. Guess that's what I get for not keeping an eye on the process :-(

Lesson learned: make sure any tables to be replicated are empty in the slave before issuing the CREATE SUBSCRIPTION command.

For cutover, I plan the following; we'll see how it goes:

1) Move all read-only processes over to the new server. That can be done at any point after the above, with little or no downtime depending on your processes.

2) Reconfigure the writing processes to point to the slave, and stop them.

3) Issue a DROP SUBSCRIPTION subscription_name on the slave, thereby "promoting" it to master.

4) Restart the writing processes, now pointed to the new master.

Steps 2-4 should happen in quick succession, resulting in only seconds of downtime. At least, that's the theory. We'll see how it goes (or if anyone else here can offer a better procedure!)
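As promised, some rough sketches of the commands, in case the concrete form is useful to anyone. All host names, users, and object names below are placeholders. Steps 1 and 2 looked roughly like this for me (the restart command will depend on how postgres is installed on your system):

    # Copy the schema only (no data) from the master to the new slave.
    # The target database must already exist on the slave (createdb my_big_db).
    pg_dump -s -h old-master.example.com -U postgres my_big_db \
        | psql -h new-slave.example.com -U postgres my_big_db

    # Enable logical decoding on both servers (requires a restart).
    # You can set wal_level=logical in postgresql.conf by hand, or:
    psql -h old-master.example.com -U postgres -c "ALTER SYSTEM SET wal_level = logical;"
    psql -h new-slave.example.com -U postgres -c "ALTER SYSTEM SET wal_level = logical;"
    # ...then restart each server, e.g. something like:
    # sudo systemctl restart postgresql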
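Step 3, on the master. The FOR TABLE variant and the REPLICA IDENTITY line are from the docs rather than anything I've actually exercised (insert-only workflow here), so treat those as untested:

    # On the master, connected to the database being replicated:
    psql -h old-master.example.com -U postgres -d my_big_db <<'SQL'
    -- Publish every table in the database...
    CREATE PUBLICATION publication_name FOR ALL TABLES;
    -- ...or, apparently, just specific tables:
    -- CREATE PUBLICATION publication_name FOR TABLE my_table, my_other_table;
    -- If a published table has no primary key and you need UPDATE/DELETE
    -- replicated, the docs say it needs a replica identity, e.g.:
    -- ALTER TABLE my_table REPLICA IDENTITY FULL;
    SQL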
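Step 4, on the slave, incorporating the lesson learned above about emptying the tables first. The user/password in the connection string are placeholders (as I understand it, the role needs replication privileges on the master), and the two status queries at the end are the sort of thing I should have been keeping an eye on:

    # On the slave: make sure the tables to be replicated are EMPTY first.
    # In my case the postgis extension had pre-populated spatial_ref_sys:
    psql -h new-slave.example.com -U postgres -d my_big_db \
        -c "DELETE FROM spatial_ref_sys;"

    # Then create the subscription; the initial data sync starts immediately:
    psql -h new-slave.example.com -U postgres -d my_big_db <<'SQL'
    CREATE SUBSCRIPTION subscription_name
        CONNECTION 'host=old-master.example.com port=5432 dbname=my_big_db user=replicator password=secret'
        PUBLICATION publication_name;
    SQL

    # Status views worth checking periodically:
    psql -h old-master.example.com -U postgres -c "SELECT * FROM pg_stat_replication;"
    psql -h new-slave.example.com -U postgres -d my_big_db -c "SELECT * FROM pg_stat_subscription;"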
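And the cutover plan in command form. The only actual database step is the DROP SUBSCRIPTION; the rest is stopping and restarting my own writer processes with new connection settings:

    # 1) readers already moved to the new server
    # 2) stop the writer processes
    # 3) "promote" the slave by dropping the subscription:
    psql -h new-slave.example.com -U postgres -d my_big_db \
        -c "DROP SUBSCRIPTION subscription_name;"
    # 4) restart the writers, now pointed at the new master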
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145