Re: INSERTing lots of data

Joachim Worringen <joachim.worringen@xxxxxxxx> · Fri, 28 May 2010 15:17:33 +0200

On 05/28/2010 02:55 PM, Craig Ringer wrote:
> On 28/05/10 17:41, Joachim Worringen wrote:
>> Greetings,
>>
>> my Python application (http://perfbase.tigris.org) repeatedly needs to
>> insert lots of data into an existing, non-empty, potentially large table.
>> Currently, the bottleneck is with the Python application, so I intend to
>> multi-thread it.
>
> That may not be a great idea. For why, search for "Global Interpreter
> Lock" (GIL).
>
> It might help if Python's mostly blocked on network I/O, as the GIL is
> released when Python blocks on the network, but still, your results may
> not be great.

I verified that the threads actually execute queries concurrently. That
does imply that they are blocked on I/O while the query is running, and
that the query performance does in fact scale for this reason.

In the "import data" case, however, I really need concurrent processing 
on the CPU in the first place, so you may be right on this one. I'll 
check it.
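
For the CPU-bound import step, one common way around the GIL is to move the
parsing into worker processes, since each process has its own interpreter.
The following is only a minimal sketch under assumptions: the input is taken
to be tab-separated "name<TAB>value" lines, and `parse_line` and `parse_all`
are hypothetical names, not part of perfbase.

```python
import concurrent.futures as cf


def parse_line(line):
    """CPU-bound step (hypothetical format): 'name<TAB>value' -> row tuple."""
    name, value = line.rstrip("\n").split("\t")
    return (name, float(value))


def parse_all(lines, workers=4, executor_cls=cf.ProcessPoolExecutor):
    """Parse lines in parallel; results come back in input order."""
    with executor_cls(max_workers=workers) as pool:
        # chunksize groups lines per task to limit inter-process overhead
        # (it is ignored if a thread pool is passed as executor_cls).
        return list(pool.map(parse_line, lines, chunksize=256))
```

On platforms that start workers by spawning a fresh interpreter, call
`parse_all` from under an `if __name__ == "__main__":` guard. Whether this
pays off depends on how expensive the per-line work is compared to the cost
of shipping lines and rows between processes.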

>> will I get a speedup? Or will table-locking serialize things on the
>> server side?
>
> Concurrent inserts work *great* with PostgreSQL, it's Python I'd be
> worried about.

That's the part of the answer I wanted to hear.
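
For the insert side, where the threads mostly wait on the server, a plain
thread pool may be enough. A rough sketch, with the database driver
deliberately left out: `insert_batch` is a hypothetical callable you supply
that opens its own connection, inserts one batch inside a transaction, and
returns the number of rows it wrote. `chunked` and `insert_concurrently` are
illustrative names as well.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_concurrently(rows, insert_batch, workers=4, batch_size=1000):
    """Run insert_batch(batch) on a pool of threads; return total rows inserted.

    Assumption of this sketch: insert_batch uses its own database
    connection, because a single connection runs one statement at a time
    and would serialize the work.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(insert_batch, chunked(rows, batch_size)))
```

The worker count and batch size here are placeholders and would need to be
measured against the real table and server.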

 thanks, Joachim
