Good afternoon,
We've finally made the migration+upgrade from old server with v7.14 to
new server with v8.4 and, before test and plan an upgrade to v9.6, I'm
checking logs to find out if there's any problem with this upgrade.
We've been fixing things and in only remains one problem in the log that
I don't understand and not sure how to fix, or if it has to be fixed.
Original logs are not in english so I'm posting my translation.
In the morning we're making a backup of -let's call it- the main
database. To avoid intensive use of this main database, we have a second
db that's a copy of the main one, used only to display data to some
users that don't need it up to date and that shouldn't change anything.
So at night we're restoring the morning backup from main database into
this second one. The upgrade that I've mentioned has ended with both,
main and second databases, being in the same server. The "problem"
(because I'm not sure if it's really a problem) is that while the backup
is restoring in the second database, it seems like autovacuum is
launched and conflicts with this db restore. The log is this:
[...] Several messages about checkpoints running too frequently [...]
2019-10-16 23:01:30.904 CEST - [162851] - user@[local]:secondDB - LOG:
duration: 7446.139 ms sentence: COPY one_table (some_columns) FROM stdin;
2019-10-16 23:01:37.457 CEST - [13750] LOG: checkpoints are running too
frequently ( 9 seconds)
2019-10-16 23:01:37.457 CEST - [13750] HINT: Consider changing
«checkpoint_segments» configuration parameter.
2019-10-16 23:01:58.663 CEST - [162851] - user@[local]:secondDB - LOG:
duration: 6492.426 ms sentence: CREATE INDEX another_table_index1 ON
another_table USING btree (another_field1);
2019-10-16 23:02:04.042 CEST - [162851] - user@[local]:secondDB - LOG:
duration: 5378.886 ms sentence: CREATE INDEX another_table_index2 ON
another_table USING btree (another_field2);
2019-10-16 23:02:11.742 CEST - [162851] - user@[local]:secondDB - LOG:
duration: 7699.246 ms sentence: CREATE INDEX another_table_index3 ON
another_table USING btree (another_field3);
2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB - LOG:
sending cancel signal to blocking autovacuum with PID 162869
2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB -
DETAIL: Process 162851 is waiting for ShareLock on relation
3880125365 for database 3880125112.
2019-10-16 23:02:12.743 CEST - [162851] - user@[local]:secondDB -
SENTENCE: CREATE INDEX another_table_index4 ON another_table USING
btree (another_field4);
2019-10-16 23:02:12.743 CEST - [162869] ERROR: cancelling autovacuum task
2019-10-16 23:02:12.743 CEST - [162869] CONTEXT: automatic analyze of
«secondDB.public.another_table»
2019-10-16 23:02:20.899 CEST - [162851] - user@[local]:secondDB - LOG:
duration: 9157.371 ms sentence: CREATE INDEX another_table_index4 ON
another_table USING btree (another_field4);
After upgrade main db to the new server, I've tuned following parameters
in postgresql.conf (using pgtune recommendations)
max_connections = 200
shared_buffers = 8GB
work_mem = 2621kB
maintenance_work_mem = 2GB
effective_io_concurrency = 2
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_completion_target = 0.7
effective_cache_size = 16GB
log_min_duration_statement = 5000
log_line_prefix = '%m - [%p] %q- %u@%h:%d - %a '
standard_conforming_strings = on
I've been looking for the problem with checkpoints and I've decided to
let it be, because these messages only appear when we make the second db
restore. The rest of the log is clean from checkpoint messages.
But I don't understand why I'm getting those messages about autovacuum
blocking db restore process. I guess that after one table is created
with COPY sentence, as many rows have been inserted, autoanalyze process
runs to gather statistics for the Execution Planner. But why is
happening this block? Is autoanalyze running before the table gets fully
loaded? Is this really a problem? If so, how can I handle it? This task
is running at night, when nobody is using second database.
Thank you for reading,
Ekaterina