Re: Manage slot in logical/pglogical replication

Alvaro Herrera <alvherre@xxxxxxxxxxxxxx> · Thu, 13 Jul 2017 16:50:51 -0400

--- Begin Message ---

Subject: Re:Manage slot in logical/pglogical replication
From: ghiureai <isabella.ghiurea@xxxxxxxxxxxxxx>
Date: Thu, 13 Jul 2017 13:43:41 -0700
In-reply-to: <E1dVgkB-0005Jn-0I@malur.postgresql.org>
References: <E1dVgkB-0005Jn-0I@malur.postgresql.org>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

Hi List,
  I'm m extremely please to  see the logical replication aka: 
transaction replication feature been implemented in Pg 10,  very nice 
work done by the contrib of this module/feature!
  Since  here is mentioned the "replication slots" are located on 
master replication host, if the master goes offline unexpected what 
will  be state of slave server :
-will accept  transactions( writes connection)  ?
 will rollback the uncommitted transactions? and move to read-write 
state/promote as new  master? ( re cgf slots)
-should the DBA consider cfg replication slots on different  host than 
master ( or at least mirror to a 3 side) ?

Thank you
Isabella
-
On 07/13/2017 09:11 AM, pgsql-general-owner@xxxxxxxxxxxxxx wrote:
Message Digest
Volume 1 : Issue 15352 : "text" Format

Messages in this Issue:
   Re: Systemd support (was:Re: Please say it isn't so)
   Re: Systemd support (was:Re: Please say it isn't so)
   Re: Manage slot in logical/pglogical replication
   I can't cancel/terminate query.
   Re: Get table OID
   Re: BDR node removal and rejoin
   Re: I can't cancel/terminate query.

----------------------------------------------------------------------

Date: Thu, 13 Jul 2017 10:49:01 -0400
From: Vick Khera <vivek@xxxxxxxxx>
To: pgsql-general@xxxxxxxxxxxxxx
Subject: Re: Systemd support (was:Re: Please say it isn't so)
Message-ID: <CALd+dcf+FxqsSUrkPZvzhLNBwzxS8opTRSFKeLA8XTn62c9phA@xxxxxxxxxxxxxx>

What exactly does the configure flag to enable systemd support do? It seems
to me that building software to the systemd platform is just the same as
building it for windows vs unix or any other platform. One can only hope it
doesn't cause the others to wither away.

On Wed, Jul 12, 2017 at 3:20 AM, Mark Morgan Lloyd <
markMLl.pgsql-general@xxxxxxxxxxxxxxx> wrote:

On 12/07/17 05:00, Steve Litt wrote:

Hi all,

Please tell me this is a mistake:

https://wiki.postgresql.org/wiki/Systemd

Why a database system should care about how processes get started is
beyond me. Systemd is an entangled mess that every year subsumes more
and more of the operating system, in a very non-cooperative way.

There are almost ten init systems. In every one of those init systems,
one can run a process supervisor, such as runit or s6 or
daemontools-encore, completely capable of starting the postgres server.

Every year, systemd further hinders interoperability, further erodes
interchangeability of parts, and continues to address problems with
WONTFIX. In the long run, you do your users no favor by including
init-system specific code in Postgres or its makefiles. If systemd
can't correctly start Postgres, I guarantee you that s6 or runit,
running on top of systemd, can.

Postgres doesn't care which language makes a query to it. Why
should Postgres care which init system started it? I hope you can free
Postgres of init-specific code, and if for some reason you can't do
that, at least don't recommend init-specific code.

OTOH since systemd is what's being supported by a significant number of
distributions it makes sense to at least try to work robustly with it.

While my preference would have been to have made such a change at a major
version transition, the reality is that database systems are competitive,
and not keeping up with the underlying platform would have been very much
to PostgreSQL's disadvantage,

OP: Please note that you do yourself no favours at all by posting a
subject line which could very easily be misinterpreted as spam.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Attachment of type text/html removed.]

------------------------------

Date: Thu, 13 Jul 2017 11:24:10 -0400
From: Tom Lane <tgl@xxxxxxxxxxxxx>
To: Vick Khera <vivek@xxxxxxxxx>
Cc: pgsql-general@xxxxxxxxxxxxxx
Subject: Re: Systemd support (was:Re: Please say it isn't so)
Message-ID: <17764.1499959450@xxxxxxxxxxxxx>

Vick Khera <vivek@xxxxxxxxx> writes:
What exactly does the configure flag to enable systemd support do?
Not a lot.  A quick grep for USE_SYSTEMD says it does nothing except
add code in the postmaster to report ready/not-ready state transitions
by calling sd_notify().  We have significantly more lines of
documentation concerning systemd than we do code.

                         regards, tom lane

------------------------------

Date: Thu, 13 Jul 2017 11:55:31 -0400
From: Alvaro Herrera <alvherre@xxxxxxxxxxxxxxx>
To: dpat <denni.pat@xxxxxxxxx>
Cc: pgsql-general@xxxxxxxxxxxxxx
ubject: Re: Manage slot in logicaS
l/pglogical replication
Message-ID: <20170713155531.hfzfxdvibw46bn6x@alvherre.pgsql>

dpat wrote:

i have configure a master-replica replication with new pglogical 2.0.
I have to replicate data over MPLS/VPN, so there is a possibility that the
link temporarily interrupts.
I know that you have to be accurately estimated pg_xlog folder.
How can I handle the prolonged interruption of the link?
Can I just extend the folder? Or drop the slot?
Yeah, data accumulates in the origin side ("master"), so you need to
make sure you have sufficient room in pg_xlog to hold all the data
generated during the network interruption.

Dropping the slot would of course release the disk space, but it would
mean that you'd have to re-create the replica afterwards.  I suppose you
could have pg_xlog large enough to hold as much data as possible for a
mid-sized network interruption, and drop the slot as a last resource if
a connection interrupts for long enough that you risk causing
reliability problems in the origin.

I can create a consumer who temporarily saves the data in an external file?
I don't think so.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

------------------------------

Date: Thu, 13 Jul 2017 10:57:46 -0500
From: Edmundo Robles <edmundo@xxxxxxxxxxxx>
To: pgsql-general@xxxxxxxxxxxxxx
Subject: I can't cancel/terminate query.
Message-ID: <CAOXzpYC-nQ+_B8jUfe5-MZS4cuSH5hrQZwu_QQjLLq-EyKGc3g@xxxxxxxxxxxxxx>

Hi!  i have  many  too long time queries,  the oldest  is  almost 16 days,
so i  tried to cancel and terminate with pg_cancel_backend and
pg_terminate_backend  but queries is still running.

STIME     ELAPSED ELAPSED %CPU   PID COMMAND
jun27 15-23:05:46 1379146  0.3 29660 postgres: argos_admin bdkairos
127.0.0.1(55605) SELECT
jun27 15-23:05:46 1379146  0.3 29659 postgres: argos_admin bdkairos
127.0.0.1(55604) SELECT
jun27 15-23:05:46 1379146  0.3 29658 postgres: argos_admin bdkairos
127.0.0.1(55603) SELECT
jun27 15-23:05:16 1379116  0.3 29909 postgres: argos_admin bdkairos
127.0.0.1(55767) SELECT
jun27 15-23:04:51 1379091  0.3 30097 postgres: argos_admin bdkairos
127.0.0.1(55909) SELECT
jun27 15-23:03:49 1379029  0.3 30661 postgres: argos_admin bdkairos
127.0.0.1(56303) SELECT
...
What should  i do to safely close that  queries, before to use kill command
in linux?

I have Postgres 9.4  running on debian jessie.

--

[Attachment of type text/html removed.]

------------------------------

Date: Thu, 13 Jul 2017 12:00:14 -0400
From: Melvin Davidson <melvin6925@xxxxxxxxx>
To: Igor Korot <ikorot01@xxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Subject: Re: Get table OID
Message-ID: <CANu8FixnW+_ph0MEdgCOUrAyzo68sKavU=zKfWX==PdJoFRyLQ@xxxxxxxxxxxxxx>

On Thu, Jul 13, 2017 at 11:06 AM, Igor Korot <ikorot01@xxxxxxxxx> wrote:

Hi, Melvin,

On Thu, Jul 13, 2017 at 10:42 AM, Melvin Davidson <melvin6925@xxxxxxxxx>
wrote:

On Thu, Jul 13, 2017 at 10:36 AM, Igor Korot <ikorot01@xxxxxxxxx> wrote:

  Hi, ALL,
Is it possible to get the table ID (or OID) from information_schema
somewhere?

Thank you.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

It 's in pg_class!

https://www.postgresql.org/docs/9.4/static/catalog-pg-class.html

But pg_class is in postgres database.
Can I do something like this:

SELECT oid FROM postgres.pg_class WHERE relname = <my_table_name>;

Or I will have to connect to postgres database?

Thank you.

IOW:
SELECT relname, oid  FROM pg_class WHERE relkind = 'r' ORDER BY 1;
--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

*Igor,*

*You do not need to specify "postgres" schema (postgres.pg_class). That is
wrong anyway.*

*FYI, the correct schema is pg_catalog, but  All postgres CATALOGS are
always available regardless of which database you are connected to.*

*and it is in the default search path, so you do not have to specify it.*

*Just do:SELECT oid    FROM pg_class  WHERE relname = <my_table_name>;*

*It will work just fine!  I highly encourage you to RTFM.*

--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

[Attachment of type text/html removed.]

------------------------------

Date: Thu, 13 Jul 2017 16:09:22 +0000
From: "Zhu, Joshua" <jzhu@xxxxxxxxxxxxx>
To: Craig Ringer <craig@xxxxxxxxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Subject: Re: BDR node removal and rejoin
Message-ID: <f5ac30a5684548e5ba672810e528c80b@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

Found these log entries from one of the other node:

t=2017-07-13 08:35:34 PDT p=27292 a=DEBUG:  00000: found valid replication identifier 15

t=2017-07-13 08:35:34 PDT p=27292 a=LOCATION:  bdr_establish_connection_and_slot, bdr.c:604

t=2017-07-13 08:35:34 PDT p=27292 a=ERROR:  53400: no free replication state could be found for 15, increase max_replication_slots

Increased max_replication_slots, things are looking good now, thanks.

This does bring up a couple of questions:

   1.  Given the fact there is no real increase in the number of nodes in this repeated removal/rejoining exercise, yet it caused replication slots being used up, wouldnâ€™t removal of a node also automatically free up the replication slot allocated for the node? Or is there a way to manually free up no longer needed slots? (the donâ€™t seem to show up in pg_replication_slots view, I made sure to use pg_drop_replication_slot when they do show up there)

   2.  If there is such a thing, what is the rule of thumb for best value of max_replication_slots (are they somehow related to the value max_wal_senders as well), with respect to, say, the max number of nodes intended to support?

Thanks

From: Craig Ringer [mailto:craig@xxxxxxxxxxxxxxx]

Sent: Wednesday, July 12, 2017 11:59 PM

To: Zhu, Joshua <jzhu@xxxxxxxxxxxxxx>

Cc: pgsql-general@xxxxxxxxxxxxxx

Subject: Re:  BDR node removal and rejoin

On 13 July 2017 at 01:56, Zhu, Joshua <jzhu@xxxxxxxxxxxxx<mailto:jzhu@xxxxxxxxxxxxx>> wrote:

Thanks for the clarification.

Looks like I am running into a different issue: while trying to pin down precisely the steps (and the order in which to perform them) needed to remove/rejoin a node, the removal/rejoining exercise was repeated a number of times, and stuck again:

   1.  The status of the re-joining node (node4) on other nodes is â€œIâ€?

   2.  The status of the re-joining node on the node4 itself started at â€œIâ€?, changed to â€œoâ€?, then stuck there

   3.  From the log file for node4, the following entries are constantly being generated:

2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  00000: received replication command: IDENTIFY_SYSTEM

2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  exec_replication_command, walsender.c:1309

2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  08003: unexpected EOF on client connection

2017-07-12 10:37:46 PDT [24943:bdr (6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  SocketBackend, postgres.c:355

2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  00000: received replication command: IDENTIFY_SYSTEM

2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]LOCATION:  exec_replication_command, walsender.c:1309

2017-07-12 10:37:46 PDT [24944:bdr (6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  08003: unexpected EOF on client connection

Check the logs on the other end.

--

  Craig Ringer                   http://www.2ndQuadrant.com/

  PostgreSQL Development, 24x7 Support, Training & Services

[Attachment of type text/html removed.]

------------------------------

Date: Thu, 13 Jul 2017 12:10:20 -0400
From: Melvin Davidson <melvin6925@xxxxxxxxx>
To: Edmundo Robles <edmundo@xxxxxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Subject: Re: I can't cancel/terminate query.
Message-ID: <CANu8FiyH_+Uu09G2ZcC2FaE+52nh3fB5fy7DiY0-nbKdvtEbUg@xxxxxxxxxxxxxx>

On Thu, Jul 13, 2017 at 11:57 AM, Edmundo Robles <edmundo@xxxxxxxxxxxx>
wrote:

Hi!  i have  many  too long time queries,  the oldest  is  almost 16 days,
so i  tried to cancel and terminate with pg_cancel_backend and
pg_terminate_backend  but queries is still running.

STIME     ELAPSED ELAPSED %CPU   PID COMMAND
jun27 15-23:05:46 1379146  0.3 29660 postgres: argos_admin bdkairos
127.0.0.1(55605) SELECT
jun27 15-23:05:46 1379146  0.3 29659 postgres: argos_admin bdkairos
127.0.0.1(55604) SELECT
jun27 15-23:05:46 1379146  0.3 29658 postgres: argos_admin bdkairos
127.0.0.1(55603) SELECT
jun27 15-23:05:16 1379116  0.3 29909 postgres: argos_admin bdkairos
127.0.0.1(55767) SELECT
jun27 15-23:04:51 1379091  0.3 30097 postgres: argos_admin bdkairos
127.0.0.1(55909) SELECT
jun27 15-23:03:49 1379029  0.3 30661 postgres: argos_admin bdkairos
127.0.0.1(56303) SELECT
...
What should  i do to safely close that  queries, before to use kill
command in linux?

I have Postgres 9.4  running on debian jessie.

--

T

*o cancel a process with pg_terminate_backend, you need to be a superuser.*
*So first;*

*SET ROLE postgres;*

Then you should be able to
*SELECT **pg_terminate_backend(<pid>); *  Where <pid> is the pid of process
you want to termnate.
--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

[Attachment of type text/html removed.]

------------------------------

End of [pgsql-general] Daily digest v1.15352 (7 messages)
**********

--- End Message ---