Search Postgresql Archives

Re: COPY FROM STDIN instead of INSERT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Oct 18, 2006, at 5:20 AM, Ilja Golshtein wrote:

When starting a database from scratch it is much faster to import the
data and then create the indexes. The time to create index on a full
table is less than the extra time from each index update from the
inserts. The more indexes to update the more time updating indexes takes.

The problem with a live database is removing the indexes slows down
current users and if you are adding 2,000 rows to a table that already
has 5,000,000 rows in it then you will loose the benefit.

I am 100% agree with you. What you are describing is a very good and useful technique for some maintenance operations.

My current goal is to increase performance in normal [almost ;)] OLTP mode of my application, so removing indexes for some time is not an option here.

And my question remains.
Is it Ok to use COPY BINARY FROM STDIN instead of multiple INSERTs?

I don't think I would use BINARY, it seems likely to be susceptible to changes in the underlying data type storage. From the docs:

"To determine the appropriate binary format for the actual tuple data you should consult the PostgreSQL source, in particular the *send and *recv functions for each column's data type (typically these functions are found in the src/backend/utils/adt/ directory of the source distribution)."

Regular text COPY is quite a big performance win over INSERTs, and doesn't require your application to know the vagaries of the data storage. Also, if you have many indices, time to update them will probably dominate anyhow, making the difference between binary and text copy negligible.

A was mentioned, COPY can only insert static data, and does not support rules (that you might use to support constraint exclusion, etc). AFIAK, the locking semantics are the same as INSERT, i.e., it does not lock the entire table or anything like that. Since it would allow transactions that insert data to finish faster, it should actually work better under high concurrency.

-Casey


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux