Hello!
Thank you for your suggestion.
I am afraid this approach is not suitable for me. As a rule, the PostgreSQL
log on the subscriber side contains many entries like the following:
ERROR: terminating logical replication worker due to timeout
00000 LOG: worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1
How should I handle this situation?
As I understand it, this is a quite normal situation. But why is its
severity ERROR?
I have another assumption. Could you correct me if I am wrong?
I found in the source code that termination of the logical replication worker
depends on the wal_receiver_timeout parameter.
So I propose setting wal_receiver_timeout to 0.
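A minimal sketch of that change, assuming it is applied server-wide on the subscriber by a superuser. ALTER SYSTEM and pg_reload_conf() are standard; whether 0 is a sensible value is exactly what I am asking about:

```sql
-- Run on the subscriber. Assumption: a server-wide change is acceptable.
-- A value of 0 disables the timeout, so the worker no longer exits
-- when nothing is received from the publisher within the interval.
ALTER SYSTEM SET wal_receiver_timeout = 0;

-- The parameter can be changed with a configuration reload; no restart is needed.
SELECT pg_reload_conf();

-- Verify the value that is now in effect.
SHOW wal_receiver_timeout;
```

One trade-off, as far as I can tell: with the timeout disabled, the subscriber no longer notices a silently dead connection by itself and has to rely on TCP keepalives.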
In that case I think monitoring pg_stat_subscription,
pg_publication and pg_stat_replication is enough.
If there is a problem with the connection or with replication,
pg_stat_replication will show nothing, because the WAL sender
will not be running; otherwise it will return some information.
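The checks I have in mind could be sketched as follows. The column names are those of the PostgreSQL 10 views; the aliases since_last_message and replay_lag_bytes are names I made up for illustration:

```sql
-- On the subscriber: one row per subscription worker (apply or table sync).
-- pid is NULL when no worker is currently running for the subscription.
SELECT subname,
       pid,
       received_lsn,
       latest_end_lsn,
       now() - last_msg_receipt_time AS since_last_message
FROM pg_stat_subscription;

-- On the publisher: one row per connected WAL sender.
-- No row at all means that no subscriber is currently connected.
SELECT application_name,
       state,
       sent_lsn,
       replay_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```

A NULL pid in the first query, or an empty result from the second, would be the condition to alert on. With wal_receiver_timeout set to 0, a steadily growing since_last_message might be the only hint of a stalled connection.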
Am I right? Are there any vulnerabilities in this approach?
Best regards,
Andrei Yahorau
From: Andrei Yahorau/IBA
To: pgsql-admin@xxxxxxxxxxxxxx
Cc: Mikalai Keida/IBA@IBA
Date: 10/08/2018 13:05
Subject: Logical replication monitoring
Hello PostgreSQL Community!
I configured logical replication for
PostgreSQL 10.4 on 2 machines: I set wal_level to logical, created a publication
on the master node and created a subscription on the standby node according to
the PostgreSQL documentation.
Could you please suggest an approach
for monitoring the replication state?
In my experience, monitoring
pg_stat_subscription, pg_publication and pg_replication_slots
is unfortunately not enough for this purpose. Moreover, the standby database
does not prohibit write operations by default, which can lead to inconsistency
between the two databases.
For example, a chain of queries such as
SELECT pg_is_in_recovery(),
SELECT * FROM pg_stat_replication
and
SELECT * FROM pg_stat_wal_receiver
provides insight into the replication state
for hot_standby replication.
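What I can check so far for logical replication is the slot view on the publisher. A sketch, assuming the PostgreSQL 10 column names; unconfirmed_bytes is an alias chosen for illustration:

```sql
-- On the publisher: state of each logical replication slot.
-- active = false means no WAL sender is currently attached to the slot.
SELECT slot_name,
       plugin,
       active,
       active_pid,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS unconfirmed_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

This shows whether a slot is in use and roughly how far the subscriber is behind, but it does not show whether the data on both sides is actually consistent.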
So is there a reliable way to monitor
the state of logical replication?
Best regards,
Andrei Yahorau