Thomas:

On Thu, Feb 23, 2017 at 4:16 PM, Thomas Güttler <guettliml@xxxxxxxxxxxxxxxxxx> wrote:
> On 22.02.2017 at 16:00, Adrian Klaver wrote:
>> only written on a successful transfer. To improve the chances of a
>> successful transfer, use more, smaller transfer batches rather than
>> larger transfers.
> I really need a solid solution.
> You said "... improve the chances of successful transfer ...". This makes
> me nervous.

I think what Adrian is saying is that smaller batches improve each individual transfer's chance of success, and if one fails you simply retry it.

> Delays are no problem, but data loss or duplication is.

Remember you can never guarantee "exactly once" without very complex machinery; what you can do is "at least once" or "at most once". That means either losing or duplicating on failure.

That being said, IF you have some kind of global unique key, unchanging at the central site, you could try the following, using a holding table in each satellite and assuming you have ON CONFLICT DO NOTHING:

1.- Move rows from main to the holding table on the satellite, in a single transaction. This gives you an immutable set to work with (as your process is the only one touching the holding tables). If there is already data in holding, that is no problem; it is left over from a previous crashed transfer.

2.- Insert everything from the holding table into main, using ON CONFLICT DO NOTHING.

3.- When everything is committed in main, truncate the satellite's holding table.

If the satellite crashes in 1, the move rolls back and you have not touched main. If you crash in 2, you will find the holding table still populated on the next round, and main will have rolled back (it is important not to commit until everything is done in 2; i.e., if you hit read problems on the satellite, do not go to 3, just abort and roll everything back). You can either loop with the current set or append more data first; it does not matter, as you have to reinsert anyway, and the ON CONFLICT DO NOTHING in 2 takes care of any potential duplicates. If you crash in 3, you will transfer the lot again, but the DO NOTHING in 2 will discard it and 3 will eventually purge it.

You can optimize on that, but basically you just repeat this until everything goes through (a SQL sketch of the loop is at the end of this message). I do this (just with two DBs, not 100) and it works. It does duplicate work, but only when there are problems; normally it runs smoothly.

If you do not have ON CONFLICT DO NOTHING (it appeared in 9.5), you can use an extra step. Instead of inserting directly into main in 2:

2.a - Copy holding over to a staging table at the central site (truncating the staging table beforehand if a previous copy is present), and
2.b - insert the new rows from that staging table into main, either by using an anti-join with main or by deleting (in the same transaction) the dupes before inserting. This variant is also sketched below.

Francisco Olarte.
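PS: A minimal sketch of the three steps in SQL. It assumes a table main(id bigint primary key, payload text) with the same shape on both sides, a holding table shaped like main on the satellite, PostgreSQL 9.5+ at the central site, and a driver script holding one connection to each database; all names are illustrative, not from Thomas's schema.

-- Step 1, on the satellite, one transaction: atomically move rows
-- from main into holding (writable CTE, available since 9.1).
BEGIN;
WITH moved AS (
    DELETE FROM main RETURNING id, payload
)
INSERT INTO holding SELECT id, payload FROM moved;
COMMIT;

-- Step 2, on the central site, one transaction: insert every row the
-- driver read from the satellite's holding table. Rows re-sent after
-- a crash are silently skipped.
BEGIN;
INSERT INTO main (id, payload)
VALUES ($1, $2)              -- executed once per row read from holding
ON CONFLICT (id) DO NOTHING;
COMMIT;

-- Step 3, on the satellite, only after step 2 has committed.
TRUNCATE holding;

If anything fails before step 2 commits, the driver just starts the next round at step 1: main is untouched and holding still has the rows.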
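And the pre-9.5 variant of step 2 with the extra staging table (again, staging is an illustrative name, same shape as main):

-- 2.a, on the central site: reload the staging table, clearing
-- whatever a previous crashed run may have left behind.
BEGIN;
TRUNCATE staging;
COPY staging (id, payload) FROM STDIN;  -- driver streams holding rows here

-- 2.b, same transaction: the anti-join inserts only the rows main
-- does not already have.
INSERT INTO main (id, payload)
SELECT s.id, s.payload
FROM staging s
LEFT JOIN main m ON m.id = s.id
WHERE m.id IS NULL;
COMMIT;

A NOT EXISTS subquery works just as well as the LEFT JOIN for the anti-join.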