On 5/22/23 16:20, Jeff Ross wrote:
> Hello!
> We are moving from 10 to 15 and are in testing now.
> Our development database is about 1400G and takes 12 minutes to complete
> a pg_upgrade run with the -k (hard-links) option. This is on a CentOS 7
> server with 80 cores.
> Adding -j 40 to use half of those cores also finishes in 12 minutes, and
> ps/top/htop never show more than a single process in use at a time.
> Bumping that to -j 80 to use them all also finishes in 12 minutes and
> still shows only a single process.
> Running the suggested vacuum analyze after pg_upgrade completes takes
> about 19 minutes. Adding -j 40 takes that time down to around 5
> minutes, pushes the server load over 30, and htop shows 40 processes.
> If -j 40 helps there, why not with pg_upgrade?
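vacuumdb, which is what runs that analyze, works table by table, so its
--jobs workers stay busy even inside a single database. A sketch of what
that invocation presumably looks like (the exact command isn't shown
above, so the flags here are an assumption):

time /usr/pgsql-15/bin/vacuumdb --all --analyze --jobs 40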
> From the docs:
> https://www.postgresql.org/docs/current/pgupgrade.html
> The --jobs option allows multiple CPU cores to be used for
> copying/linking of files and to dump and restore database schemas in
> parallel; a good place to start is the maximum of the number of CPU
> cores and tablespaces. This option can dramatically reduce the time to
> upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster? Per the docs
above, pg_upgrade's --jobs parallelizes the schema dump/restore per
database and the file copying/linking per tablespace, so one big
database in one tablespace keeps only a single job busy.
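A quick way to check is the per-database size breakdown
(pg_database_size() is available in both 10 and 15):

psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size FROM pg_database ORDER BY pg_database_size(datname) DESC;"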
> The full commands we are using for pg_upgrade are pretty stock:
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
> /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
> /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
> /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
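Since the file-linking side of --jobs is split per tablespace, it is
also worth confirming how many tablespaces are in play, e.g.:

psql -c "SELECT spcname, pg_size_pretty(pg_tablespace_size(oid)) FROM pg_tablespace;"

If that only shows pg_default and pg_global, the link step runs as a
single job no matter what -j is set to.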
> Our production database is closer to 1900G. If we're looking at a 30
> minute pg_upgrade window we'll be okay, but if there is anything we can
> do to knock that time down we will, and any suggestions would be
> greatly appreciated.
> Jeff Ross
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx