Thanks for the clarification. Looks like I am running into a different issue: while trying to pin down precisely the steps (and the order in which to perform them) needed to remove/rejoin a node, the removal/rejoining
exercise was repeated a number of times, and stuck again:
2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]DEBUG: 00000: received replication command: IDENTIFY_SYSTEM 2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]LOCATION: exec_replication_command, walsender.c:1309 2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]DEBUG: 08003: unexpected EOF on client connection 2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]LOCATION: SocketBackend, postgres.c:355 2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]DEBUG: 00000: received replication command: IDENTIFY_SYSTEM 2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]LOCATION: exec_replication_command, walsender.c:1309 2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]DEBUG: 08003: unexpected EOF on client connection 2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]LOCATION: SocketBackend, postgres.c:355 2017-07-12 10:37:46 PDT [24946:bdr (6334686760735153516,1,43845,):receive:::1(33885)]DEBUG: 00000: received replication command: IDENTIFY_SYSTEM 2017-07-12 10:37:46 PDT [24946:bdr (6334686760735153516,1,43845,):receive:::1(33885)]LOCATION: exec_replication_command, walsender.c:1309 2017-07-12 10:37:46 PDT [24946:bdr (6334686760735153516,1,43845,):receive:::1(33885)]DEBUG: 08003: unexpected EOF on client connection 2017-07-12 10:37:46 PDT [24946:bdr (6334686760735153516,1,43845,):receive:::1(33885)]LOCATION: SocketBackend, postgres.c:355 2017-07-12 10:37:49 PDT [24949:bdr (6394432535408825526,1,37325,):receive:::1(33892)]DEBUG: 00000: received replication command: IDENTIFY_SYSTEM 2017-07-12 10:37:49 PDT [24949:bdr (6394432535408825526,1,37325,):receive:::1(33892)]LOCATION: exec_replication_command, walsender.c:1309 2017-07-12 10:37:49 PDT [24949:bdr (6394432535408825526,1,37325,):receive:::1(33892)]DEBUG: 08003: unexpected EOF on client connection 2017-07-12 10:37:49 PDT [24949:bdr (6394432535408825526,1,37325,):receive:::1(33892)]LOCATION: SocketBackend, postgres.c:355 What do these entries say? and what can be done to correct the situation (there have been no change with respect to either postgres or network configuration in the remove/rejoin
exercise)? Thanks From: Craig Ringer [mailto:craig@xxxxxxxxxxxxxxx]
On 11 July 2017 at 05:49, Zhu, Joshua <jzhu@xxxxxxxxxxxxx> wrote:
BDR1 requires that you manually remove the bdr.bdr_nodes entry if you intend to re-use the same node name.
-- Craig Ringer
http://www.2ndQuadrant.com/ |