Michael Hull <mikehulluk@xxxxxxxxxxxxxx> writes:
> I am fairly new to practical databases, but I am trying out the c
> interface to postgres and am wondering how to improve performance. I
> am a researcher, and I am trying to perform a large parameter sweep.
> Since this will involve a couple of thousand simulations, I have a
> process that manages which simulations have been done, and which still
> need to be done, so I can run it easily on a cluster.

There's basically no way for the performance of that loop to not suck.
You're incurring two round trips to the server per jobid change, plus
query parse/plan times; so even though the underlying table
manipulations are relatively simple, there's just too much overhead.

One way to fix this problem is to push the procedural logic into a
stored procedure that runs on the server, so that your app sends one
query, gets back one rowset of its newly assigned jobs, and all the
database update side-effects are done by the procedure.  That's only a
small conceptual leap from where you are, but does require getting up
to speed on plpgsql or one of the other procedural languages.

The more radical way to fix it is to write the whole thing as one SQL
command.  This requires thinking about your problem as an operation on
a set of rows, rather than an iteration, so it can be a pretty big
conceptual jump for database novices.  I'm not sure it's actually
possible to do it given the specific table organization you've adopted
--- if you need to both delete rows in unassignedjobs and insert rows
in assignedjobs, there's no way to do it in one SQL operation.  But if
you're not yet wedded to that representation, you should consider
having just one table and implementing the state change as an update
to a status column instead of moving the data to a different table.
Then you could probably implement the operation in a single
UPDATE ... RETURNING command.

Also, this looks suspiciously like a priority queue, which means
you're basically reinventing the wheel.  People have sweated the
details on this type of thing before, and come up with very clever
solutions that perform well even with multiple clients concurrently
trying to obtain job assignments.  Simple solutions tend to slow way
down or even deadlock under those conditions :-(.  Check the Postgres
list archives.

			regards, tom lane
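
As a rough illustration of the stored-procedure route described above,
here is a minimal plpgsql sketch that keeps the two-table layout.  The
table and column names (unassignedjobs.jobid, assignedjobs.jobid and
worker) and the function name claim_jobs are assumptions for the sake
of the example, not anything from the original post; the point is only
that the client issues one call, the row movement happens entirely on
the server, and the batch of newly owned jobids comes back as the
result set.

    -- Sketch only: claim up to p_batch jobs for p_worker in one call,
    -- moving each claimed row from unassignedjobs to assignedjobs and
    -- returning the claimed jobids to the caller.
    CREATE OR REPLACE FUNCTION claim_jobs(p_worker text, p_batch integer)
    RETURNS SETOF integer AS $$
    DECLARE
        v_jobid integer;
    BEGIN
        FOR v_jobid IN
            SELECT jobid FROM unassignedjobs
             ORDER BY jobid
             LIMIT p_batch
             FOR UPDATE            -- lock the rows we are about to move
        LOOP
            DELETE FROM unassignedjobs WHERE jobid = v_jobid;
            INSERT INTO assignedjobs (jobid, worker)
                 VALUES (v_jobid, p_worker);
            RETURN NEXT v_jobid;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

    -- The client side then needs just one query per batch:
    -- SELECT * FROM claim_jobs('worker-01', 50);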
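
And as a sketch of the single-table variant, assuming a hypothetical
jobs(jobid, status, worker) table where the state change is just a
status flip, one UPDATE ... RETURNING both records the assignment and
hands the chosen jobids back in the same round trip:

    -- Sketch only: claim 50 unassigned jobs for one worker and get
    -- their ids back in a single statement.
    UPDATE jobs
       SET status = 'assigned', worker = 'worker-01'
     WHERE jobid IN (SELECT jobid
                       FROM jobs
                      WHERE status = 'unassigned'
                      ORDER BY jobid
                      LIMIT 50)
    RETURNING jobid;

Note that this naive form is exactly the sort of "simple solution" the
last paragraph warns about: with several workers grabbing jobs at
once, two of them can select overlapping jobid sets, so one blocks
behind the other and may then reassign jobs the first already took.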
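
For what it's worth, that concurrent-claim problem is what
SELECT ... FOR UPDATE SKIP LOCKED (added in PostgreSQL 9.5, so not an
option discussed in this thread) is aimed at: each worker locks only
rows nobody else already holds, so concurrent claimers neither block
one another nor hand out the same job twice.  A hedged sketch, still
using the hypothetical jobs table from the previous example:

    -- Sketch only; requires PostgreSQL 9.5+ for SKIP LOCKED.
    UPDATE jobs
       SET status = 'assigned', worker = 'worker-01'
     WHERE jobid IN (SELECT jobid
                       FROM jobs
                      WHERE status = 'unassigned'
                      ORDER BY jobid
                      LIMIT 50
                        FOR UPDATE SKIP LOCKED)
    RETURNING jobid;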