On 02/21/2017 07:53 AM, Thomas Güttler wrote:
Am 21.02.2017 um 15:12 schrieb Adrian Klaver:
On 02/21/2017 12:53 AM, Thomas Güttler wrote:
I want to move table rows from one database to an central database.
You actually talking about moving from ~100 databases to the central
database, correct?
Both run PostgreSQL.
Are all the Postgres instances the same version and what is the
version or versions?
Yes, all run postgres, but the version can be different (but not much).
Satellite-DB 9.5 and 9.6 and central 9.6.
My use case looks like this:
There are N satellite databases in different data centers. N is about
100 at the moment.
There is one central database.
I need a way to reliably move rows from the satellite databases to the
central one
Two ways I can think of:
https://www.postgresql.org/docs/9.6/static/dblink.html
https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
Since postgres_fdw is newer, I would focus on it, right?
If I understood it correctly, then there would be N (about 100) tables
in the central DB.
What happens if there is a network outage (for about 3 minutes) during
accessing a remote table?
I misunderstood your original intent, I thought this was a one time
process to move data to the central database. Given that it is to be a
continuous process a FDW may not be the answer, one of the reasons being
the above question. You will be denied the data in the remote table
during the outage. Also not sure what you will be doing with the data in
the central database and how often? In any case it will involve reaching
out to all the satellites each time you want to query the latest data.
Looks more like some kind of push mechanism from the satellites to the
central database is in order. Then once the data is on the central
database it is 'captured'. A question that comes to mind is if there is
a problem should the data transfer from one or more satellites lag that
of the others?
Is there a Primary Key on the satellite tables or some way of
determining unique rows?
The concrete schema is not specified up to now. But I guess UUID as
primary key would be the best fit.
Or am I wrong?
Is there any existing overlap between the data in the central database
and the satellite databases?
No, there won't be overlaps. Every satellite system creates its own rows.
How much data are you talking about moving from each database?
How active are the satellite databases?
100k rows per day per satellite. Each row has only few bytes.
Moving of rows should happen every ten minutes.
- inserts can happen during syncing.
Can UPDATEs happen?
No, rows get created and moved and later deleted.
Thank you Adrian for your questions. It helped me to narrow down my
problem.
Regards,
Thomas
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general