Search Postgresql Archives

Re: Disabling/Enabling index before bulk loading

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ravi Krishna <srkrishna@xxxxxxxxx> writes:

> We recently did a test on COPY and found that on large tables (47 million rows , 20GB of raw data) the 
> difference in COPY with 16 indexes and COPY without any index is 1:14. That is, COPY is 14 times slower 
> when data is ingested with all indexes as opposed to COPY first without index and then create all index.
>

This is very system dependent. On our system, when we tested a similar
approach, we found that the time saved through dropping the indexes
before copy was lost when rebuilding the indexes afterwards. In fact, it
ended up being slightly slower.

I suspect a lot depends on the number and types of indexes you
have. Your example had a lot more indexes than ours. We were loading 22
batches with 2.5M records per batch. While copy was significantly faster
than transaction based inserts (minutes vs hours), the differences between
indexes and no indexes was measured in minutes. We only had 3 or 4
indexes.   

> I googled for earlier posting on this and it looks like this has been asked before too.  
>
> This is what I am thinking to do:
>
> 1 - Extract index definition and save it as a SQL somewhere, either a file or a table.
> 2 - Drop all indexes.
> 3 - Ingest data via COPY
> 4 - Recreate all indexes saved in (1).
>
> Is there a generic sql or script or tool to accomplish (1).
>

We are loading data via Javascript using pg and pg-copy-streams modules. It is
pretty straight forward to drop the indexes and recreate them afterwards
via sql, so we didn't look for a tool as such.

As data is only inserted into this table and only by this process, we
also turned off autovacuum for this table, performing vacuum and analyze
manually after load. 

Tim

-- 
Tim Cross




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux