Re: replication issue

Alvaro Aguayo Garcia-Rada <aaguayo@xxxxxxxxxxxxxxx> · Wed, 4 Oct 2017 11:00:26 -0500 (PET)

Hi.

 Just restoring your non-BDR backup into a BDR cluster will not avoid future problems with sequences.

 At first glance, the problem you are having is that all your sequences are local sequences(the only sequence type without BDR). That works fine on an standalone, non-BDR node, but it could not work on a BDR cluster.

 If you create a sequence on a BDR cluster, that sequence will be created on all nodes. However, if you call nextval for that sequence on your 3 nodes, you will probably get the same values on each one. That's why BDR have a different type of sequences, "global sequences".

 BDR Global Sequences are split into chunks, and each group of chunks is assigned to each server in the cluster; nextval get the next value on a chunk assigned to the server you call it, guaranteeing that you will get different values on each server. It also guarantees that a second call to nextval(in the same server) will produce a value higher than the previous call. However, it does not guarantee time-sequence correlation across servers. More info about global sequences: http://bdr-project.org/docs/stable/global-sequences.html

 To solve your problem, you must use global sequences. There's no conversion process, so you must drop your current sequences and recreate them as global sequences. Something like this:

SELECT nextval('seq_nnnn'); -- Will give you a value, let's say "x"
DROP SEQUENCE seq_nnnn;
CREATE SEQUENCE seq_nnnn START x USING bdr; -- Replace "x" with 

Remember to assign USAGE permission as needed.

That will do the trick.

Saludos,

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103  | RPC: (+51) 954183248
Website: www.ocs.pe

----- Original Message -----
From: "Keefer, Brad (ITS)" <Brad.Keefer@xxxxxxxxxx>
To: "pgsql-admin" <pgsql-admin@xxxxxxxxxxxxxx>
Sent: Wednesday, 4 October, 2017 10:28:20
Subject:  replication issue

So, I am migrating from a single Postgres DB node to a clustered setup.

I want to use BDR and run a multi-master cluster, so I have that configured between my 3 DB nodes.

I restored the DB backup from the single DB node on one of the BDR-connected nodes, and saw that it replicated the schema and tables ok, but I'm finding some weird behavior once I begin to use the cluster.

I have a user who is trying to log into the application that is using this Postgres DB as a backend, but he is unable to. Looking at the logs, it appears that when Postgres is trying to place his id in a table called auth_user, it is complaining that the key already exists. Looking at a table called auth_user_id_seq, I see that the information is different on each of the BDR nodes, and that makes me wonder why, since it should match up with what I restored.

This is what it looks like:

node01:
awx=# SELECT * FROM auth_user_id_seq;
  sequence_name   | last_value | start_value | increment_by |      max_value
  | min_value | cache_value | log_cnt | is_cycled | is_called | amdata
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |         50 |           1 |            1 | 922337203685477580
7 |         1 |           1 |       0 | f         | t         |
(1 row)

awx=#

node02:
awx=# SELECT * FROM auth_user_id_seq;
  sequence_name   | last_value | start_value | increment_by |      max_value
  | min_value | cache_value | log_cnt | is_cycled | is_called | amdata
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |         17 |           1 |            1 | 922337203685477580
7 |         1 |           1 |      30 | f         | t         |
(1 row)

awx=#

node03:
awx=# SELECT * FROM auth_user_id_seq;
  sequence_name   | last_value | start_value | increment_by |      max_value
  | min_value | cache_value | log_cnt | is_cycled | is_called | amdata
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |          4 |           1 |            1 | 922337203685477580
7 |         1 |           1 |      31 | f         | t         |
(1 row)

awx=#

I have to think that I missed a step somewhere during the restore process, or configuring the replication, that would lead to this... My expectation would be that those three entries would be the same.

We are also having other duplicate-key-related issues in the app, like creating a new template.

Any help you could offer would be greatly appreciated.

Thanks.

Brad Keefer
Red Hat Consultant, ITS, RHCSA

Office of Information Technology Services
W A Harriman State Campus, Bldg 8, Floor 7
(919)210-7045 | Brad.Keefer@xxxxxxxxxx<mailto:Brad.Keefer@xxxxxxxxxx>
www.its.ny.gov<http://www.its.ny.gov/>

-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin