Re: Bulk DML performance

On Thu, 2025-03-13 at 18:13 +0800, bill.poole@xxxxxxxxx wrote:
> 
> it is noteworthy that inserting 27 MB of data into a newly created table creates
> 191 MB of data including the index and 127 MB of data excluding the index.

PostgreSQL has a lot of overhead per row.
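Each heap row carries a 23-byte tuple header plus alignment padding, and each
index entry adds its own overhead on top of that, so a sizeable blow-up over
the raw 27 MB is expected for narrow rows.

If you want to see where the space goes, something like this (the table name
is just a placeholder for yours) shows the heap/index breakdown:

    SELECT pg_size_pretty(pg_relation_size('mytable'))       AS heap_size,
           pg_size_pretty(pg_indexes_size('mytable'))        AS index_size,
           pg_size_pretty(pg_total_relation_size('mytable')) AS total_size;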

> 
> Can you help me understand why performing 3 million lookups on a b-tree index
> with all pages cached in memory takes so long?

It is probably not the lookup, but the *modification* of the index that is slow.

> 
> It seems like deleting 3 million rows identified by 3 million IDs should be
> faster than updating 3 million rows (also identified by 3 million IDs).

It should be, yes.
To find out where the time is spent, use EXPLAIN (ANALYZE, BUFFERS) on the
statement.
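For example, something like this (the table and column names are placeholders):

    EXPLAIN (ANALYZE, BUFFERS)
    DELETE FROM mytable
    WHERE id = ANY ('{1,2,3}'::bigint[]);

The per-node timings and the "Buffers:" lines will show whether the time goes
into finding the rows or into writing the heap and index pages.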

> > To get better performance, the best I can think of is to parallelize loading
> > the data until you saturate CPU, disk or hit internal contention in the database.
> 
> Sadly, I cannot do that because I need all rows to be inserted in a single
> database transaction, which I cannot do over multiple database connections.

Then the best you can do is to use COPY rather than INSERT.
It will perform better (but not vastly better).
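For example (again with made-up table and column names), from psql on the
client side:

    \copy mytable (id, payload) from 'data.csv' with (format csv)

or, if the server can read the file directly:

    COPY mytable (id, payload) FROM '/path/to/data.csv' WITH (FORMAT csv);

A single COPY is one statement, so it still runs in one transaction, and you
can wrap it in an explicit BEGIN/COMMIT together with other statements.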

Yours,
Laurenz Albe
