
Re: terminating walsender process due to replication timeout


 



Hello,
Thank you for the response.

Yes, it is possible to monitor replication delay, but my questions were not about monitoring network issues.

I use exactly wal_sender_timeout=1s because it lets me detect replication problems quickly.
So I need clarification on the following questions:
Is it possible to use exactly this configuration and be sure that it will work properly?
What did I do wrong? Should I correct my configuration somehow?
Is this the same issue as the one mentioned here:
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@xxxxxxxxxxxxxxx ? If so, why am I facing this problem again?

Thank you in advance.
Best regards,
Andrei





From:        Rene Romero Benavides <rene.romero.b@xxxxxxxxx>
To:        AYahorau@xxxxxxxxxxx,
Cc:        Postgres General <pgsql-general@xxxxxxxxxxxxxx>
Date:        14/05/2019 20:12
Subject:        Re: terminating walsender process due to replication timeout




To detect network issues maybe you could monitor replication delay.
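
As a rough illustration of what monitoring replication delay could look like (a sketch, not something taken from this thread): on the master, the pg_stat_replication view exposes sent_lsn and replay_lsn, and the difference between the two is the lag in bytes. The helper names and the LSN values below are hypothetical; in practice the values would be read from the view, or the subtraction done server-side with pg_wal_lsn_diff().

```python
# Sketch: convert PostgreSQL LSN strings to byte positions and compute lag.
# An LSN is printed as two hexadecimal numbers separated by a slash:
# the high 32 bits and the low 32 bits of a 64-bit WAL position.

def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL sent by the master but not yet replayed on the standby."""
    return lsn_to_bytes(sent_lsn) - lsn_to_bytes(replay_lsn)


if __name__ == "__main__":
    # Hypothetical values, standing in for one row of pg_stat_replication.
    print(replication_lag_bytes("0/3000148", "0/3000060"))
```

A check like this reports how far behind the standby is, which is a different signal from wal_sender_timeout: the timeout only says that no reply arrived in time.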

On Mon, May 13, 2019 at 6:42 AM <AYahorau@xxxxxxxxxxx> wrote:
Hello PostgreSQL Community!

I faced an issue on my Linux machine using Postgres 11.3.

I have 2 nodes in the DB cluster: master and standby.

I ran a number of long-running queries, which led to the databases becoming desynchronized:

terminating walsender process due to replication timeout


Here is the output in debug mode:

2019-05-13 13:21:33 FET 00000 DEBUG:  sending replication keepalive

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0

2019-05-13 13:21:34 FET 00000 LOG:  terminating walsender process due to replication timeout
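
To make the timing in the log above explicit, here is a small sketch that measures the gap between the last keepalive and the termination message. It assumes each log line starts with a date and a time in the format shown above (which depends on log_line_prefix); the function names are hypothetical and the timezone abbreviation is ignored.

```python
from datetime import datetime

KEEPALIVE = "sending replication keepalive"
TIMEOUT = "terminating walsender process due to replication timeout"


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' fields of a log line."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")


def seconds_from_keepalive_to_timeout(lines):
    """Seconds between the last keepalive and the first timeout that follows it.

    Returns None if no keepalive is followed by a timeout message.
    """
    last_keepalive = None
    for line in lines:
        if KEEPALIVE in line:
            last_keepalive = parse_timestamp(line)
        elif TIMEOUT in line and last_keepalive is not None:
            return (parse_timestamp(line) - last_keepalive).total_seconds()
    return None


if __name__ == "__main__":
    # The first and last lines of the log excerpt above.
    log = [
        "2019-05-13 13:21:33 FET 00000 DEBUG:  sending replication keepalive",
        "2019-05-13 13:21:34 FET 00000 LOG:  terminating walsender process due to replication timeout",
    ]
    print(seconds_from_keepalive_to_timeout(log))
```

With second-resolution timestamps the result is only approximate, but it shows how close the gap is to the configured wal_sender_timeout of 1 second.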



The issue is reproducible. I configure a 2-node cluster, download demo_small.zip from
https://edu.postgrespro.ru/ and run the following command:
psql -U user1 -f demo_small.sql db1

and I get the observed behaviour.



I know that I can increase the wal_sender_timeout value to avoid this behaviour (currently wal_sender_timeout is set to 1 second).

To be honest, I don't want to increase wal_sender_timeout because I would like to detect network issues quickly.


After some googling, I found that someone faced a similar issue
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@xxxxxxxxxxxxxxx which was fixed in PostgreSQL 9.4.16.


Is my issue the same as the one described here
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@xxxxxxxxxxxxxxx ?
Is there any other way to avoid it without increasing wal_sender_timeout?



Thank you in advance.

Regards,
Andrei



--
Genius is 1% inspiration and 99% perspiration.
Thomas Alva Edison

http://pglearn.blogspot.mx/

