At ORCID we're trying to upgrade our Postgres database (10 to 13) using pglogical to avoid downtime. The problem we have is how long the initial copy is taking for the ~500GB database. If it takes, say, 20 days to complete, will we need to keep 20 days of WAL files so that it can start catching up once the copy is complete?
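To put rough numbers on that, here is a back-of-envelope sketch. It assumes the provider's replication slot holds back all WAL generated while the copy runs, which is my understanding of how slots behave but is exactly what I'd like confirmed. The 5 GB/day WAL rate and the example LSNs are made-up placeholders, not measurements from our system; real LSNs could be read from pg_current_wal_lsn() and pg_replication_slots.restart_lsn on the provider.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' to an absolute byte position.

    An LSN is printed as two hex numbers: the high 32 bits, a slash, the low 32 bits.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current write position."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)


def copy_throughput_mb_s(db_size_gb: float, days: float) -> float:
    """Average copy rate in MB/s implied by copying db_size_gb in the given days."""
    return db_size_gb * 1024 / (days * 86400)


def wal_backlog_gb(wal_gb_per_day: float, days: float) -> float:
    """WAL that piles up if none of it can be recycled for the given days."""
    return wal_gb_per_day * days


if __name__ == "__main__":
    # 500 GB over 20 days is what we are seeing; 5 GB/day of WAL is a placeholder.
    print(f"implied copy rate: {copy_throughput_mb_s(500, 20):.2f} MB/s")
    print(f"WAL backlog after 20 days: {wal_backlog_gb(5, 20):.0f} GB")
    # Hypothetical LSNs, only to show the arithmetic.
    print(f"retained WAL: {retained_wal_bytes('17/0', '16/B374D848')} bytes")
```

The implied copy rate is well under 1 MB/s, which is part of why I suspect the copy itself is the thing to fix rather than the WAL retention.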
I read an earlier thread which pointed me to the tool pglogical_create_subscriber, which uses pg_basebackup to start the initial replication. However, because pg_basebackup takes a physical copy, this is only going to be useful when both clusters are on the same major version.
I had hoped that the COPY could be parallelized more by "max_sync_workers_per_subscription", which is set to 2. However, there's only a single COPY process:-
postgres 1022196 6.0 24.5 588340 491564 ? Ds Sep22 193:19 postgres: main: xxx xxxx 10.xx.xx.xx(59144) COPY
Some of the best real-world examples I've found are the threads on GitLab's own GitLab instance about their Postgres migrations. They discussed one method that might work:-
1. Set up 9.6 secondary via streaming
2. Turn physical secondary into logical secondary
3. Shutdown and upgrade secondary to 10
4. Turn secondary back on.
In which case we would only need the time required to perform the upgrade.
--
Giles Westwood
Senior Devops Engineer, ORCID