> Bulk data imports of this size I've done with minimal pain by simply
> breaking the raw data into chunks (10M records becomes 10 files of
> 1M records), on a separate spindle from the database, and performing
> multiple COPY commands but no more than 1 COPY per server core.
> I tested this a while back on a 4-core server, and when I attempted 5
> COPYs at a time the time to complete went up almost 30%. I don't
> recall any benefit to having fewer than 4 in this case, but the server
> was only processing my data at the time. Indexes were on the target
> table; however, I dropped all constraints. The UNIX split command is
> handy for breaking the data up into individual files.
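For reference, the quoted workflow looks roughly like this. The database name (mydb), table name (mytable), and file name (data.tsv) are made up, and it assumes the chunks are in COPY's default tab-delimited text format:

    # Split the raw file into 1M-line chunks (data.tsv.aa, data.tsv.ab, ...).
    split -l 1000000 data.tsv data.tsv.

    # Feed the chunks to COPY, at most 4 at a time (one per core on a
    # 4-core box). split's output names contain no spaces, so ls is safe here.
    ls data.tsv.* | xargs -P 4 -I {} \
        psql -d mydb -c "\copy mytable FROM '{}'"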
I'm not using COPY. My dump file is a bunch of INSERT INTO statements. I know it would be faster to use COPY, and if I can figure out how to do it within the hour I will try it. I did two mysqldumps, one with INSERT INTO statements and one as CSV, so I can try COPY at a later time. For now I'm running five parallel psql processes to import the data, which has been broken out by table (see the sketch below).
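In case it helps anyone else, here's roughly what that setup looks like; the table and database names are made up, and it assumes one dump file per table:

    # One psql per table's dump file, all running in parallel.
    for f in users.sql orders.sql items.sql events.sql logs.sql; do
        psql -d mydb -f "$f" &
    done
    wait

    # Later, the CSV dumps could be loaded with COPY instead, e.g.:
    #   psql -d mydb -c "\copy users FROM 'users.csv' WITH (FORMAT csv)"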