On 3/13/20 4:11 AM, Nicola Contu wrote:
So in the logs I now see this:
2020-03-13 11:03:42 GMT [10.150.20.22(45294)] [27804]: [1-1]
db=[unknown],user=replicator LOG: terminating walsender process due to
replication timeout
Yeah, that's been showing up in the log snippets you have been posting.
To figure this out you will need to:
1) Make a list of what changed since the last time replication worked
consistently.
2) Monitor the changed components, start logging or increase logging.
3) Monitor the chain of replication as a whole, to catch changes that you
do not know about. Since you seem to be operating across data centers,
that would include verifying the network.
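As a starting point for step 2, here is a minimal sketch that pulls the
walsender timeout entries out of the primary's server log and counts them
per client and hour, so they can be lined up against network or disk events.
It assumes the log_line_prefix layout seen in the snippet above and one log
entry per line in the actual log file (the snippet is only wrapped by the
mail client). The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the posted snippet:
#   timestamp TZ [client_ip(port)] [pid]: [n-n] db=...,user=... LEVEL: message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ "
    r"\[(?P<host>[^\](]+)\((?P<port>\d+)\)\] "
    r"\[(?P<pid>\d+)\]: \[\d+-\d+\] "
    r"db=(?P<db>[^,]*),user=(?P<user>\S*) "
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)
TIMEOUT_MSG = "terminating walsender process due to replication timeout"


def find_walsender_timeouts(lines):
    """Return (timestamp, client_host, user) for each walsender timeout entry."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and TIMEOUT_MSG in m.group("msg"):
            hits.append((m.group("ts"), m.group("host"), m.group("user")))
    return hits


def timeouts_per_hour(hits):
    """Count timeouts per (client_host, 'YYYY-MM-DD HH') bucket."""
    return Counter((host, ts[:13]) for ts, host, _user in hits)


if __name__ == "__main__":
    import sys

    # Usage: python walsender_timeouts.py < postgresql.log
    for (host, hour), n in sorted(timeouts_per_hour(
            find_walsender_timeouts(sys.stdin)).items()):
        print(f"{hour}:00  {host}  {n} timeout(s)")
```

If the timeouts cluster at particular hours or only for one standby address,
that narrows down which changed component to look at first.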
So I tried increasing wal_sender_timeout to 300s, but it did not help.
On Thu, 12 Mar 2020 at 15:56, Nicola Contu
<nicola.contu@xxxxxxxxx> wrote:
The encryption is at the OS level: the drives where the db saves its
data are encrypted with a password.
On Thu, 12 Mar 2020 at 15:51, Adrian Klaver
<adrian.klaver@xxxxxxxxxxx> wrote:
On 3/12/20 4:31 AM, Nicola Contu wrote:
> The replicator is ok and the replicated as well.
> %Cpu(s): 0.2 us, 1.0 sy, 0.0 ni, 94.8 id, 4.0 wa, 0.0 hi, 0.0 si, 0.0 st
>
> CPU is really low on both.
>
> I am running pg_basebackup again every time.
> Any other suggestions?
>
I have to believe there is a connection between the change to
encrypted disks and your issues. Not sure what, but to help narrow it
down: how is the encryption being done, and what program is being used?
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx