On Tue, May 10, 2011 at 7:35 PM, Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On 11/05/11 05:34, Aren Cambre wrote:
>
>> Using one thread, the app can do about 111 rows per second, and it's
>> only exercising 1.5 of 8 CPU cores while doing this. 12,000,000 rows /
>> 111 rows per second ~= 30 hours.
>
> I don't know how I missed that. You ARE maxing out one cpu core, so
> you're quite right that you need more threads unless you can make your
> single worker more efficient.
>
> Why not just spawn more copies of your program and have them work on
> ranges of the data, though? Might that not be simpler than juggling
> threading schemes?

I suggested that earlier. But now I'm wondering if there are
efficiencies to be gained by moving all the heavy lifting into the db
as well as splitting things into multiple partitions to work on. I.e.
don't grab 1,000 rows, work on them on the client side, and then
insert the data; do the data mangling in the query in the database.
My experience has been that moving things like this into the database
can result in performance gains of several factors, taking hour-long
processes and making them run in minutes.
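
As a rough sketch of what I mean (table and column names here are
made up, and the transformations are just placeholders for whatever
mangling the app does), one set-based statement can replace the whole
client-side fetch/transform/insert loop, and a modulo predicate is one
cheap way to carve the table into disjoint slices so several copies of
the process can run in parallel:

    -- Hypothetical schema: raw_events holds the unprocessed rows,
    -- clean_events the transformed output.
    INSERT INTO clean_events (event_id, officer, happened_at)
    SELECT id,
           trim(lower(officer_name)),                  -- mangle in SQL,
           to_timestamp(raw_ts, 'MM/DD/YYYY HH24:MI')  -- not in the client
    FROM   raw_events
    WHERE  id % 8 = 3;  -- slice 3 of 8; run one copy per slice

Each worker gets its own WHERE id % 8 = n, so they never touch the
same rows, and nothing but the final statement crosses the wire.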