Yang Zhang <yanghatespam@xxxxxxxxx> writes:
> You're right, we're only sequentially issuing (unprepared) UPDATEs.

You definitely want to fix both parts of that, then.

> If we ship many UPDATE statements per call to our DB API's execution
> function (we're using Python's psycopg2 if that matters, but I think
> that just binds libpq), would that avoid the network round trip per
> statement?

Possibly; I'm not sure how psycopg2 handles that.

> If not, what if we use anonymous procedures (DO) to run multiple
> UPDATE statements?

I don't think an anonymous procedure as such would result in any plan
caching, at least not unless you could write it to have a single
UPDATE in a loop.

> Finally, we could use the technique highlighted in my third bullet
> and use COPY (or at least multiple-value INSERT), then merging the
> new data with the old.  Would that be the most direct route to
> maximum performance?

It might help; you'd need to try it.

> In any case, I assume deleting and rebuilding indexes is important
> here, yes?  But what about raising checkpoint_segments - does this
> actually help sustained throughput?

If you're updating as much as 50% of the table, and you don't need the
indexes for other purposes meanwhile, dropping and rebuilding them
would be worth trying.

Also, you definitely want checkpoint_segments large enough so that
checkpoints are at least a few minutes apart.  Excess checkpoints do
represent a sustained drag on performance because they mean a greater
volume of disk writes.

			regards, tom lane
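
To make the COPY-then-merge idea concrete, here is a minimal sketch
using psycopg2: load the new values into a temporary staging table
with COPY, then apply them in a single UPDATE ... FROM.  The table and
column names (accounts, id, balance) are hypothetical placeholders,
not anything from this thread.

    import io
    import psycopg2

    def bulk_update(dsn, rows):
        """rows: iterable of (id, new_balance) tuples."""
        conn = psycopg2.connect(dsn)
        try:
            with conn.cursor() as cur:
                # 1. Stage the new values with COPY, avoiding one
                #    network round trip per row.
                cur.execute("""
                    CREATE TEMP TABLE staging (id integer, balance numeric)
                    ON COMMIT DROP
                """)
                buf = io.StringIO(
                    "".join("%d\t%s\n" % (i, b) for i, b in rows)
                )
                cur.copy_from(buf, "staging", columns=("id", "balance"))

                # 2. Merge the staged values into the real table in one
                #    statement, so the UPDATE is planned and executed
                #    only once instead of once per row.
                cur.execute("""
                    UPDATE accounts AS a
                    SET balance = s.balance
                    FROM staging AS s
                    WHERE a.id = s.id
                """)
            conn.commit()
        finally:
            conn.close()

If restructuring the work into a single UPDATE isn't practical,
psycopg2.extras.execute_batch() (or execute_values() for multi-row
INSERTs) in recent psycopg2 versions at least cuts down the
per-statement network round trips.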