Hi Siddharth,
Restore speed also depends on the data and its types. Typically it comes down to how much RAM is available to pg_restore relative to the other processes on the machine.
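For instance, the memory knobs that usually matter during a restore are along these lines (illustrative values only, not a recommendation for your workload):

```
# postgresql.conf fragment -- example values, tune to your machine
maintenance_work_mem = 1GB   # memory for index rebuilds during restore
work_mem = 64MB              # per-sort/per-hash memory for each backend
shared_buffers = 4GB         # shared cache; often ~25% of RAM
```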
Regards,
Pavan
On Mon, 6 May, 2019, 4:54 PM Siddharth Karandikar, <siddharth.karandikar@xxxxxxxxx> wrote:
Hello All,
This is my first, and somewhat long, email on this list. I wanted to
share all the information I have gathered so far, so it got a bit
long.
I am trying to restore a database dump created on PostgreSQL 9.5.16 to
PostgreSQL 10.7, and it is taking a lot of time. Restoring the same
dump to 9.5 doesn't take nearly as long, so I am wondering what the
reason behind this slowness could be.
Some numbers to show how slow it is:
Small tables (<500MB):
- 9.5 to 9.5 took 1m4.110s
- 9.5 to 10 took 1m4.575s
Medium tables (500MB to 5GB):
- 9.5 to 9.5 took 6m53.132s
- 9.5 to 10 took 6m0.998s
Large tables (5GB+):
- 9.5 to 9.5 took 131m35.106s
- 9.5 to 10 took 831m25.539s
Underlying hardware and OS used in both cases are exactly the same.
Thinking that a data format difference between Postgres 9.5 and 10
could be the reason for this slowness, I tried dumping from Postgres
10 and restoring it on Postgres 10, but that is also super slow.
Here are some numbers of that:
Small tables (<500MB):
- 10 to 10 took 0m51.248s
Medium tables (500MB to 5GB):
- 10 to 10 took 5m40.224s
Large tables (5GB+):
- 10 to 10 took 832m36.850s
Restore command

The command I am using to restore:

pg_restore -v -F d -n public -U <user> -d <database> -h localhost -j 4 \
    --disable-triggers -e -L <file with table names> \
    /var/lib/pgsql/20190425/pg_dump_data
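The per-table timings above were collected with runs along these lines, restricting the restore to a single table at a time (the names in angle brackets are placeholders, as above):

```
# Time the restore of one table to isolate the slow case.
# <user>, <database>, <table> are placeholders.
time pg_restore -v -F d -n public -U <user> -d <database> -h localhost \
    -j 4 --disable-triggers -e -t <table> \
    /var/lib/pgsql/20190425/pg_dump_data
```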
The hardware I am running on:
CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (2 cores)
Memory: 16GB
Disk: 300GB NVMe
FS: ext4
Postgres configuration that I have on this setup:
shared_buffers = 128MB
work_mem = 1MB
maintenance_work_mem = 16MB
dynamic_shared_memory_type = posix
huge_pages = try
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
bgwriter_lru_maxpages = 100
full_page_writes = on
seq_page_cost = 1.0
random_page_cost = 4.0
wal_level = minimal
wal_sync_method = fsync
wal_log_hints = off
wal_buffers = -1
wal_writer_delay = 200ms
min_wal_size = 1GB
max_wal_size = 80GB
max_wal_senders = 0
wal_keep_segments = 0
wal_sender_timeout = 60s
wal_receiver_status_interval = 10s
wal_receiver_timeout = 60s
autovacuum = on
log_autovacuum_min_duration = -1
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_freeze_max_age = 200000000
autovacuum_multixact_freeze_max_age = 400000000
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1
Some more things observed:
* Running strace on the Postgres process showed a continuous stream of
lseek/read/write calls, along with a bunch of futex wait and resume
calls. These futex calls are not easily seen on Postgres 9.5.
* Looking at pg_stat_activity showed WALWriteLock under the
wait_event column. The value kept changing, but WALWriteLock came up
very frequently.
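For reference, the WALWriteLock sightings came from sampling pg_stat_activity while the restore was running, roughly like this (placeholders as above; the wait_event columns exist on 9.6+/10, not on 9.5):

```
# Snapshot wait events once a second during the restore.
while sleep 1; do
    psql -U <user> -d <database> -Atc \
        "SELECT now(), pid, wait_event_type, wait_event
         FROM pg_stat_activity WHERE state <> 'idle'"
done
```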
Thanks,
Siddharth Karandikar
PS: I had posted this on Stack Overflow, but it didn't get much
attention there, so I am posting it here. I don't know if that's
considered cross-posting; if it is, I will keep that in mind and
avoid it in the future.