
Re: INSERTing lots of data

Joachim Worringen wrote:
My Python application (http://perfbase.tigris.org) repeatedly needs to insert lots of data into an existing, non-empty, potentially large table. Currently, the bottleneck is with the Python application, so I intend to multi-thread it. Each thread should work on a part of the input file.

You are wandering down a path followed by pgloader at one point: http://pgloader.projects.postgresql.org/#toc6 and one that I fought with briefly as well. Simple multi-threading may be of only minimal help in scaling up insert performance here, because Python's GIL (global interpreter lock) lets only one thread run Python code at a time. Maybe we can get Dimitri to chime in here; he did more of this than I did.

Two thoughts. First, build a performance test case on the assumption that it will fail to scale up, and look for problems. If you get lucky, great, but don't assume this will work; it has proven more difficult for others in the past than it looks.
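A minimal sketch of such a scaling test, assuming the per-row work is CPU-bound Python (parsing and formatting) rather than time spent waiting on the server. The function parse_rows and the synthetic input are hypothetical stand-ins for the application's real work, and the block assumes Python 3; substitute the actual parsing and insert code before drawing conclusions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def parse_rows(lines):
    """Hypothetical stand-in for the CPU-bound work done per input line."""
    return [tuple(int(field) for field in line.split(",")) for line in lines]


def time_with_threads(lines, n_threads):
    """Return the seconds taken to parse all lines using n_threads threads."""
    # Split the input into one contiguous slice per thread.
    size = max(1, (len(lines) + n_threads - 1) // n_threads)
    slices = [lines[i:i + size] for i in range(0, len(lines), size)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parsed = list(pool.map(parse_rows, slices))
    elapsed = time.perf_counter() - start
    # Sanity check: no rows were lost while splitting.
    assert sum(len(p) for p in parsed) == len(lines)
    return elapsed


def scaling_report(lines, thread_counts=(1, 2, 4)):
    """Map each thread count to its elapsed time in seconds."""
    return {n: time_with_threads(lines, n) for n in thread_counts}


if __name__ == "__main__":
    sample = ["%d,%d,%d" % (i, i * 2, i * 3) for i in range(200000)]
    for n, seconds in scaling_report(sample).items():
        print("%d thread(s): %.3f s" % (n, seconds))
```

If the elapsed time stays flat or gets worse as threads are added, that suggests the GIL rather than the database is the limit.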

Second, if you do end up being throttled by the GIL, you can probably build a solution for Python 2.6/3.0 using the multiprocessing module for your use case: http://docs.python.org/library/multiprocessing.html
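As a rough sketch of the multiprocessing approach, assuming Python 3 and an input that can be split into independent chunks of lines: each worker process has its own interpreter and its own GIL. The names split_into_chunks, load_chunk and load_in_parallel are hypothetical, and the database work is left as a comment, since the connection details depend on the driver in use.

```python
import multiprocessing


def split_into_chunks(lines, n_chunks):
    """Split lines into at most n_chunks contiguous, roughly equal parts."""
    size = max(1, (len(lines) + n_chunks - 1) // n_chunks)
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def load_chunk(lines):
    """Hypothetical worker: parse one chunk and return the row count.

    A real worker would open its own database connection here, rather
    than sharing one across processes, and insert the parsed rows.
    """
    rows = [tuple(line.split(",")) for line in lines]
    return len(rows)


def load_in_parallel(lines, n_workers):
    """Process the chunks in separate processes, each with its own GIL."""
    chunks = split_into_chunks(lines, n_workers)
    with multiprocessing.Pool(processes=n_workers) as pool:
        return sum(pool.map(load_chunk, chunks))


if __name__ == "__main__":
    sample = ["%d,value-%d" % (i, i) for i in range(100000)]
    print(load_in_parallel(sample, n_workers=4))
```

The main guard matters here: on platforms where worker processes are started by re-importing the main module, unguarded top-level code would run again in every worker.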

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

