Re: Regarding Postgres - Insertion Time Getting Increased As Data Volume is getting increased

Daniele Varrazzo <daniele.varrazzo@xxxxxxxxx> · Wed, 10 Feb 2021 10:29:07 +0100

On Wed, 10 Feb 2021, 09:26 Rajnish Vishwakarma, <rajnish.nationfirst@xxxxxxxxx> wrote:
Hi Postgres Team,
The below are the scenarios which we are dealing with.

1) There are 20 Tables - On an average each having 150 columns.

2) There are 20 Threads Handled by Thread Pool Executor ( here we are using Python's - psycopg2 module / library to  fetch the data .)

3) I am using the below statement to insert the data using Python - psycopg2 module - using the exceute(...) command as .

sql_stmt = "INSERT INTO " + name_Table + final_col_string + "VALUES" + str(tuple(array_of_curly_values))
print('Sql statement', sql_stmt)col_cursor_db = db_conn.cursor()
v = col_cursor_db.execute(sql_stmt);

This is an insecure way to do it, but that's beside the point.

But earlier the same 22 threads were running and the insertion time was gradually increased from 1 second to 30-35 seconds.

Requesting and urging the postgres general support team to help me out on this.

How can i increase the INSERTION speed to minimize the insertion time taken by each thread in the THREAD POOL.

Using a COPY statement instead of insert. For a more moderate change in your code, but for a smaller increase of speed, you can look at the batching helpers (https://www.psycopg.org/docs/extras.html#fast-execution-helpers).

Or there any different python libraries other than psycopg2 ?

Psycopg3 hasn't been released yet, so using it is on the experimental side. However it provides a better support to using copy which would be perfect for your use case (https://www.psycopg.org/psycopg3/docs/copy.html#writing-data-row-by-row).

-- Daniele