Re: Best COPY Performance

Luke Lonergan wrote:
Stefan,

On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <stefan@xxxxxxxxxxxxxxxx> wrote:

We've found that there is an ultimate bottleneck at about 12-14MB/s despite
having sequential write to disk speeds of 100s of MB/s.  I forget what the
latest bottleneck was.
I have personally managed to load a bit less than 400k rows/s (5 int columns,
no indexes) - on very fast disk hardware - at that point postgresql is
completely CPU bottlenecked (2.6GHz Opteron).

400,000 rows/s x 4 bytes/column x 5 columns/row = 8MB/s

Using multiple processes to load the data will help to scale up to about
900k/s (4 processes on 4 cores).

yes I did that about half a year ago as part of the "CREATE INDEX on a 1.8B row table" thread on -hackers that resulted in some of the sorting improvements in 8.2. I don't think there is much more possible in terms of import speed by using more cores (at least not when importing to the same table) - iirc I was at nearly 700k/s with two cores and 850k/s with 3 cores or such ...


18MB/s?  Have you done this?  I've not seen this much of an improvement
before by using multiple COPY processes to the same table.
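
For reference, the MB/s figures quoted in this exchange follow from the row rates if one assumes 4 bytes per int column and counts only the raw column payload (no tuple headers, no text encoding). A minimal sketch of that conversion; the function name and the 1 MB = 10^6 bytes convention are assumptions made for illustration:

```python
# Assumption: each int column occupies 4 bytes and per-row overhead is ignored,
# matching the back-of-the-envelope estimate in the thread.
BYTES_PER_INT = 4


def raw_mb_per_s(rows_per_s, columns=5, bytes_per_column=BYTES_PER_INT):
    """Raw payload rate in MB/s, with 1 MB taken as 10**6 bytes."""
    return rows_per_s * columns * bytes_per_column / 1_000_000


if __name__ == "__main__":
    # Row rates mentioned in the thread: 1, 2, 3 and 4 loader processes.
    for rows in (400_000, 700_000, 850_000, 900_000):
        print(f"{rows:>7} rows/s -> {raw_mb_per_s(rows):.0f} MB/s")
```

Under these assumptions 400k rows/s corresponds to 8MB/s and 900k rows/s to 18MB/s, which is where the two numbers above come from.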

Another question: how to measure MB/s - based on the input text file?  On
the DBMS storage size?  We usually consider the input text file in the
calculation of COPY rate.
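
To illustrate why the choice of basis matters: in COPY's default text format, columns are tab-delimited and each row ends with a newline, so the input size per row depends on how many digits the values have, not on the 4-byte storage width. A rough sketch, with made-up sample values and hypothetical helper names:

```python
def text_row_bytes(values):
    """Size of one row in COPY text format: tab-delimited, newline-terminated."""
    return len("\t".join(str(v) for v in values)) + 1  # +1 for the newline


def copy_rate_mb_per_s(input_bytes, elapsed_s):
    """COPY rate based on input text file size, with 1 MB taken as 10**6 bytes."""
    return input_bytes / elapsed_s / 1_000_000


if __name__ == "__main__":
    # Hypothetical row of five 7-digit integers (not data from the thread).
    row = (1000000, 2000000, 3000000, 4000000, 5000000)
    per_row = text_row_bytes(row)          # 35 digits + 4 tabs + 1 newline
    rate = copy_rate_mb_per_s(per_row * 400_000, 1.0)
    print(per_row, "bytes/row,", rate, "MB/s of input text")
```

With rows like these, the same 400k rows/s would count as 16MB/s of input text rather than 8MB/s of raw int payload, so the two bases can differ by a factor of two or more.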


yeah that is a good question (and part of the reason why I cited the rows/sec number btw.)


Stefan

