Re: COPY v. java performance comparison

On Apr 2, 2014, at 1:14 PM, Rob Sargent <robjsargent@xxxxxxxxx> wrote:

> On 04/02/2014 01:56 PM, Steve Atkins wrote:
>> On Apr 2, 2014, at 12:37 PM, Rob Sargent <robjsargent@xxxxxxxxx>
>>  wrote:
>> 
>>> 
>>> Impatience got the better of me and I killed the second COPY.  This time it had done 54% of the file in 6.75 hours, extrapolating to roughly 12 hours to do the whole thing.
>>> 
>> That seems rather painfully slow. How exactly are you doing the bulk load? Are you CPU limited or disk limited?
>> 
>> Have you read 
>> http://www.postgresql.org/docs/current/interactive/populate.html
>>  ?
>> 
>> Cheers,
>>   Steve
>> 
>> 
> The copy command was pretty vanilla:
> copy oldstyle from '/export/home/rob/share/testload/<file-redacted>' with delimiter ' ';
> I've been to that page, but (as I read them) none sticks out as a sure thing.  I'm not so worried about the actual performance as I am with the relative throughput (sixes so far).
> 
> I'm not cpu bound, but I confess I didn't look at io stats during the copy runs. I just assume it was pegged :)

If each row is, say, 100 bytes including the per-row overhead (plausible for a uuid and a couple of strings), and you're inserting 800 rows a second, that's about 80 kB/second, which would be fairly pathetic.
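
As a quick check of that arithmetic, here is a minimal Python sketch. The function name is only illustrative, and the 100 bytes per row is the same guess as above, not a measured value.

```python
def throughput_bytes_per_sec(row_bytes, rows_per_sec):
    """Estimate raw load throughput from row size and insert rate."""
    return row_bytes * rows_per_sec


# Assumed figures from the paragraph above: ~100 bytes/row, 800 rows/s.
rate = throughput_bytes_per_sec(100, 800)
print(f"{rate} bytes/s = {rate / 1000:.0f} kB/s")  # 80000 bytes/s = 80 kB/s
```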

On my laptop (which has an SSD, sure, but it's still a laptop) I can insert 40M rows of data that has a few integers and a few small strings in about 52 seconds. And that's just using a simple, single-threaded load using psql to run copy from stdin, reading from the same disk as the DB is on, with no tuning of any parameters to speed up the load.
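
To put the two rates side by side, a rough sketch follows. The helper name is made up for illustration, and the only inputs are the numbers already quoted in this message (40M rows in about 52 seconds, versus 800 rows a second).

```python
def rows_per_sec(rows, seconds):
    """Average insert rate over a whole load."""
    return rows / seconds


laptop_rate = rows_per_sec(40_000_000, 52)  # laptop COPY run described above
slow_rate = 800                             # rows/s figure for the slow load

print(f"laptop: {laptop_rate:,.0f} rows/s")
print(f"ratio:  {laptop_rate / slow_rate:,.0f}x")
```

That works out to somewhere around three orders of magnitude between the two loads, which is why the slow one looks like a problem rather than a tuning issue.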

12 hours suggests there's something fairly badly wrong with what you're doing. I'd definitely look at the server logs, check system load and double check what you're actually running.

(Running the same thing on a tiny VM, one that shares a single RAID5 of 7200rpm drives with about 40 other VMs, takes a shade under two minutes and is mostly CPU bound.)

Cheers,
  Steve



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




