Are you even getting COPY to work with JDBC? As far as I am aware, COPY doesn't work with JDBC at the moment:
http://jdbc.postgresql.org/todo.html Listed in the todo page, under "PG Extensions" is "Add support for COPY." I tried to use it with JDBC a while ago and gave up after a couple limited experiments and reading that -- but didn't dig very deep into it.
As suggested, you should determine if you are disk bound or CPU bound. My experience with COPY is that it is suprisingly easy to make it CPU bound, but the conditions for that can vary quire a bit from schema to schema and hardware to hardware.
pg_bulkload may not be the tool for you for many reasons -- it requires a rigid data format and control file, very similar to Oracle's sqlloader. It may not fit your needs at all -- its just worth a look to see if it does since if there's a match, it will be much faster.
http://jdbc.postgresql.org/todo.html Listed in the todo page, under "PG Extensions" is "Add support for COPY." I tried to use it with JDBC a while ago and gave up after a couple limited experiments and reading that -- but didn't dig very deep into it.
As suggested, you should determine if you are disk bound or CPU bound. My experience with COPY is that it is suprisingly easy to make it CPU bound, but the conditions for that can vary quire a bit from schema to schema and hardware to hardware.
pg_bulkload may not be the tool for you for many reasons -- it requires a rigid data format and control file, very similar to Oracle's sqlloader. It may not fit your needs at all -- its just worth a look to see if it does since if there's a match, it will be much faster.
On Sun, Aug 31, 2008 at 6:49 PM, Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> wrote:
Thomas Finneid wrote:Just on a side note, your system is pretty strangely heavy on CPU
> Hi
>
> I am working on a table which stores up to 125K rows per second and I
> find that the inserts are a little bit slow. The insert is in reality a
> COPY of a chunk of rows, up to 125K. A COPY og 25K rows, without an
> index, is fast enough, about 150ms. With the index, the insert takes
> about 500ms. The read though, is lightning fast, because of the index.
> It takes only 10ms to retrieve 1000 rows from a 15M row table. As the
> table grows to several billion rows, that might change though.
>
> I would like the insert, with an index, to be a lot faster than 500ms,
> preferrably closer to 150ms. Any advice on what to do?
> Additionally, I dont enough about pg configuring to be sure I have
> included all the important directives and given them proportional
> values, so any help on that as well would be appreciated.
>
> Here are the details:
>
> postgres 8.2.7 on latest kubuntu, running on dual Opteron quad cores,
> with 8GB memory and 8 sata disks on a raid controller (no raid config)
compared to its RAM and disk configuration. Unless your workload in Pg
is computationally intensive or you have something else hosted on the
same machine, those CPUs will probably sit mostly idle.
The first thing you need to do is determine where, during your bulk
loads, the system is bottlenecked. I'd guess it's stuck waiting for disk
writes, personally, but I'd want to investigate anyway.
If you're not satisfied with the results from pg_bulkload you can look
into doing things like moving your indexes to separate tablespaces (so
they don't fight for I/O on the same disk sets as your tables),
separating your bulk load tables from other online/transactional tables,
etc.
Also, to relay common advice from this list:
If you land up considering hardware as a performance answer, getting a
decent SAS RAID controller with a battery backed cache (so you can
enable its write cache) and a set of fast SAS disks might be worth it.
For that matter, a good SATA RAID controller and some 10kRPM SATA disks
could help too. It all appears to depend a lot on the particular
workload and the characteristics of the controller, though.
--
Craig Ringer
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance