On 11/25/20 11:18 AM, Tom Lane wrote:
Mats Julian Olsen <mats@xxxxxxxxxxxxxxxxx> writes:
I've got some more numbers here:
...
To me this does indicate some sort of networking issue, but I'm
wondering if INSERTs are treated differently than SELECTs in
postgres_fdw? The only feasibly explanation I have is that postgres_fdw
does many more network calls for INSERT than for SELECT, e.g. something
like 1 for SELECT and `n` for INSERT?
I don't have the code in front of me, but from memory, postgres_fdw
will issue an INSERT statement to the remote for each row it has to
insert. Maybe you are indeed just dealing with spectacularly bad
network round trip times.
You could try turning on log_statement and/or log_duration on the
remote to see if that sheds any more light about how much time is
spent executing each insertion vs. the network delay.
There's been some recent discussion about teaching postgres_fdw to
batch insertions, which would likely be helpful in your situation.
I don't know how close that is to committable, but in any case
it couldn't see the light of day earlier than v14. In the meantime,
if you're sufficiently desperate maybe you could switch to using
dblink with manually-issued multi-row INSERTs. (This needn't
preclude continuing to access the table with postgres_fdw when
that's suitable.)
So even if Mats where to break this query:
INSERT INTO foreign.labels (address, labels)
SELECT address_id, ARRAY_AGG(name) AS labels
FROM labels
GROUP BY 1
LIMIT 100;
down into something like this:
INSERT INTO foreign.labels (address, labels)
VALUES (), (), (), ();
postgres_fdw would send it as individual INSERTs?
regards, tom lane
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx