
Re: Possibilities for optimizing inserts across oracle_fdw foreign data wrapper

On 9/19/21 06:28, Niels Jespersen wrote:

Hello all

 

We are often using the oracle_fdw to transfer data between Postgres (version 11+) and Oracle (version 18+). It works great.

 

However I have a task at hand that requires inserting a few billion rows in an Oracle table from a Postgres query.

 

insert into t_ora (a,b,c)   

select a,b,c from t_pg;

 

This is driven from a plpgsql stored procedure, if that matters.

 

I want to optimize the running time of this, but I am unsure which possibilities, if any, there actually are.

 

Reducing the number of network roundtrips is usually a good way to increase throughput. But, how do I do that?

 

If I could make the Oracle insert a direct-path load, that would usually also increase throughput. But is that possible here? There are no constraints defined on the destination tables.

 

Regards Niels Jespersen


The problem with oracle_fdw is that the SQL is parsed on the Postgres side, not on the Oracle side. If it were parsed on the Oracle side, you could use the /*+ APPEND */ hint, which is essentially a direct-path insert. You will have to write a script in one of the scripting languages that uses array insert, which is available with the Instant Client. Even the Oracle ODBC driver uses array insert, as can be seen in the following article:

https://dbwhisperer.wordpress.com/2020/11/21/pyodbc-fast_executemany-and-oracle-rdbms/
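To make the array-insert idea concrete, here is a minimal sketch that sends rows in fixed-size arrays through the standard Python DB-API executemany() call. The helper name insert_in_batches, the batch size, and the bind-style SQL in the usage note are illustrative assumptions, not something taken from the article above. How many network round trips each executemany() call really costs depends on the driver, so treat this as a starting point to measure, not a guaranteed speedup.

```python
from itertools import islice


def insert_in_batches(cursor, sql, rows, batch_size=10_000):
    """Send rows to the database in arrays of at most batch_size rows.

    `cursor` is any DB-API cursor; `rows` is any iterable of tuples.
    Returns the total number of rows handed to the driver.
    """
    it = iter(rows)
    total = 0
    while True:
        # Pull the next array of rows without materializing the whole input.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        # One executemany() call per array; drivers with array binding
        # can send the whole array in a single round trip.
        cursor.executemany(sql, batch)
        total += len(batch)
    return total
```

With an Oracle driver, a call might look like insert_in_batches(cur, "insert into t_ora (a,b,c) values (:1, :2, :3)", rows), followed by a commit on the connection. The table and columns are the ones from the original question; the connection setup is left out because it depends on your environment.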

Unfortunately, the Postgres side of the equation is not particularly good at array fetch and does not do well at cutting down the number of network round trips:

https://github.com/mkleehammer/pyodbc/wiki/Driver-support-for-fast_executemany

I would use a script on the Postgres side and then use the superior options provided by SQL*Net. You will need some fancy programming to avoid waiting on each operation. I would actually write two scripts: one reading data from Postgres and converting it to CSV, then piping it into a second script that inserts the data into Oracle. That would make the scripts work in parallel, at least partially. Situations like this are the reason why a DBA needs to know how to script. So, this is where you start:

https://python.swaroopch.com/
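As a rough sketch of the two-script idea, the reader side can stream an already executed cursor to CSV with fetchmany(), and the writer side can turn CSV from its standard input back into arrays of rows for a batched insert. The function names export_csv and csv_batches and the default sizes are hypothetical, chosen only for illustration. Note that a CSV round trip turns every value into a string and does not distinguish NULL from an empty string, so real data may need explicit conversion.

```python
import csv
from itertools import islice


def export_csv(cursor, out, fetch_size=10_000):
    """Reader side: stream rows from an executed DB-API cursor to `out` as CSV.

    Returns the number of rows written.
    """
    writer = csv.writer(out)
    count = 0
    while True:
        # Fetch a chunk at a time instead of the whole result set.
        rows = cursor.fetchmany(fetch_size)
        if not rows:
            break
        writer.writerows(rows)
        count += len(rows)
    return count


def csv_batches(inp, batch_size=10_000):
    """Writer side: yield lists of CSV rows (lists of strings) read from `inp`."""
    reader = csv.reader(inp)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch
```

The two halves would live in separate scripts connected by a shell pipe, for example "python pg_export.py | python ora_load.py" (both file names are made up here), with export_csv writing to sys.stdout and csv_batches reading from sys.stdin. The pipe lets the reader keep fetching while the writer is busy inserting. On the Postgres side, a driver such as psycopg2 needs a server-side (named) cursor for fetchmany() to avoid pulling the entire result set into client memory first.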

Regards

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
