I've been working for the past few weeks on porting a closed source BitTorrent tracker to use PostgreSQL instead of MySQL for storing statistical data, but I've run into a rather large snag. The tracker in question buffers its updates to the database, then makes them all at once, sending anywhere from 1-3 MiB of query data. With MySQL, this is accomplished using the INSERT INTO ... ON DUPLICATE KEY UPDATE query, which handles the insert/update very quickly; generally it only takes about a second for the entire set of new data to be merged. The problem I am encountering is that when I attempt to duplicate this functionality in Postgres, it is so slow as to be unusable. The tracker currently handles around 10,000-40,000 client updates per minute, which translates to roughly the same number of rows in the database. Part of the issue is that some of those rows cannot be updated because they do not yet exist in the database, but updates outnumber inserts by roughly 100:1.

After consulting with some of the folks on the PostgreSQL IRC channel on freenode.net, I was left with this idea to try:

---------------------------------------------------------------------
BEGIN

CREATE TEMP TABLE temp_p2 ON COMMIT DROP AS
  (SELECT tid, uid, uploaded, downloaded, remaining, avg_up, avg_down,
          active, timespent, ip, port, peer_id, blocked
     FROM peers2 WHERE FALSE)

COPY temp_p2 FROM STDIN WITH CSV QUOTE AS ''''

< the data is sent by the tracker using PQputCopyData >

UPDATE peers2 AS p
   SET uploaded   = p.uploaded + t.uploaded,
       downloaded = p.downloaded + t.downloaded,
       remaining  = t.remaining,
       avg_up     = t.avg_up,
       avg_down   = t.avg_down,
       active     = t.active,
       timespent  = p.timespent + t.timespent,
       ip         = t.ip,
       port       = t.port,
       blocked    = t.blocked,
       timestamp  = CURRENT_TIMESTAMP
  FROM temp_p2 AS t
 WHERE (p.uid = t.uid AND p.tid = t.tid)

INSERT INTO peers2 (tid, uid, uploaded, downloaded, remaining, avg_up,
                    avg_down, active, timespent, ip, port, peer_id, blocked)
SELECT t.*
  FROM temp_p2 AS t
  LEFT JOIN peers2 USING (uid, tid)
 WHERE peers2.uid IS NULL AND peers2.tid IS NULL

COMMIT
---------------------------------------------------------------------

Initial attempts showed the UPDATE query was incredibly slow. After sitting down at the psql command line, I managed to get the query plan for it after much waiting.

# EXPLAIN ANALYZE UPDATE peers2 ... etc etc

                                  QUERY PLAN
-------------------------------------------------------------------------
 Merge Join  (cost=262518.76..271950.65 rows=14933 width=153) (actual time=8477.422..9216.893 rows=26917 loops=1)
   Merge Cond: ((p.tid = t.tid) AND (p.uid = t.uid))
   ->  Sort  (cost=177898.12..180004.09 rows=842387 width=65) (actual time=7803.248..8073.817 rows=109732 loops=1)
         Sort Key: p.tid, p.uid
         ->  Seq Scan on peers2 p  (cost=0.00..25885.87 rows=842387 width=65) (actual time=0.043..4510.771 rows=647686 loops=1)
   ->  Sort  (cost=84620.64..85546.64 rows=370400 width=96) (actual time=641.438..761.893 rows=55393 loops=1)
         Sort Key: t.tid, t.uid
         ->  Seq Scan on temp_p2 t  (cost=0.00..11112.00 rows=370400 width=96) (actual time=0.093..275.110 rows=55393 loops=1)
 Total runtime: 192569.492 ms
(9 rows)

(Apologies if the formatting got ruined by my e-mail client.)

Essentially, it looks like the planner is sorting both tables on the join columns from the WHERE clause, then merging the rows that match. The problem is that, as can be seen, peers2 has 600,000+ rows, so sequentially scanning and sorting it is a rather non-trivial operation.
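One thing I have not actually benchmarked yet, so treat it as a sketch rather than a result: running ANALYZE on the temp table right after the COPY, since the plan above estimates 370,400 rows in temp_p2 when only 55,393 are really there. With accurate statistics, and perhaps an index on the temp table, the planner might choose an index-driven plan against peers2 instead of sorting both tables. The index name here is just a placeholder:

-- Untested sketch: refresh statistics on the temp table after the COPY so
-- the planner's row estimates match reality, and give it an index to join on.
-- "temp_p2_uid_tid_idx" is a placeholder name of my own invention.
CREATE INDEX temp_p2_uid_tid_idx ON temp_p2 (uid, tid);
ANALYZE temp_p2;
-- ...then run the same UPDATE peers2 ... FROM temp_p2 ... as above.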
As a side note, there is a unique index on (uid, tid) in peers2, so any lookup on those columns should be fully indexed.

After this method seemed to fail miserably, I took another approach and wrote a stored procedure, which should in theory accomplish much the same thing. I assumed this would be faster because it would iterate over the temp_p2 table sequentially, and do a simple index lookup + update to the peers2 table on each step, without any sorting or craziness required. For this to work, the tracker needs to automatically categorize client updates into "needs UPDATE" or "needs INSERT" buffers, which are handled separately. The inserts are lightning quick and are not an issue, but the updates, as illustrated below, are not very good.

Here is the first version of the stored procedure:

CREATE OR REPLACE FUNCTION tracker_update() RETURNS integer AS $PROC$
DECLARE
    rec temp_p2%ROWTYPE;
BEGIN
    FOR rec IN SELECT * FROM temp_p2 LOOP
        UPDATE peers2
           SET uploaded   = uploaded + rec.uploaded,
               downloaded = downloaded + rec.downloaded,
               remaining  = rec.remaining,
               avg_up     = rec.avg_up,
               avg_down   = rec.avg_down,
               active     = rec.active,
               timespent  = timespent + rec.timespent,
               ip         = rec.ip,
               port       = rec.port,
               peer_id    = rec.peer_id,
               blocked    = rec.blocked,
               timestamp  = CURRENT_TIMESTAMP
         WHERE uid = rec.uid AND tid = rec.tid;
    END LOOP;
    RETURN 1;
END;
$PROC$ LANGUAGE plpgsql;

Here are the results. I ran it through EXPLAIN ANALYZE just to get the timing information, since one cannot view query plans for stored procedures that way. This is on the same dataset as the UPDATE above.

# EXPLAIN ANALYZE SELECT tracker_update();

                                  QUERY PLAN
-------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=496676.127..496676.128 rows=1 loops=1)
 Total runtime: 496684.305 ms
(2 rows)

Since this took roughly 2.6 times as long as the original UPDATE query, I ventured to write a second version that uses EXECUTE instead, so there is no pre-planning and the UPDATEs run individually with constant values. Here is that second version:

CREATE OR REPLACE FUNCTION tracker_update2() RETURNS integer AS $PROC$
DECLARE
    rec temp_p2%ROWTYPE;
BEGIN
    FOR rec IN SELECT * FROM temp_p2 LOOP
        EXECUTE 'UPDATE peers2 SET uploaded = uploaded + ' || rec.uploaded
            || ', downloaded = downloaded + ' || rec.downloaded
            || ', remaining = ' || rec.remaining
            || ', avg_up = ' || rec.avg_up
            || ', avg_down = ' || rec.avg_down
            || ', active = ' || rec.active
            || ', timespent = timespent + ' || rec.timespent
            || ', ip = ' || quote_literal(host(rec.ip))
            || ', port = ' || rec.port
            || ', peer_id = ' || quote_literal(encode(rec.peer_id, 'escape'))
            || ', blocked = ' || rec.blocked
            || ', timestamp = CURRENT_TIMESTAMP'
            || ' WHERE uid = ' || rec.uid
            || ' AND tid = ' || rec.tid;
    END LOOP;
    RETURN 1;
END;
$PROC$ LANGUAGE plpgsql;

As an aside, the quote_literal calls are needed for the ip and peer_id fields, which are inet and bytea types, respectively. Here is the execution time of this stored procedure on the same dataset as the UPDATE above:

# EXPLAIN ANALYZE SELECT tracker_update2();

                                  QUERY PLAN
------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=432153.101..432153.103 rows=1 loops=1)
 Total runtime: 432153.127 ms
(2 rows)

Hardly an improvement. Unfortunately, at this time I'm at a bit of a loss.
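For completeness, here is a rough, untested sketch of a single-pass variant that folds the insert/update decision into the loop itself (using plpgsql's FOUND flag) instead of having the tracker pre-sort the buffers. The function name is just a placeholder of mine, and I have no reason to believe it would be any faster than the versions above:

CREATE OR REPLACE FUNCTION tracker_merge() RETURNS integer AS $PROC$
DECLARE
    rec temp_p2%ROWTYPE;
BEGIN
    FOR rec IN SELECT * FROM temp_p2 LOOP
        -- Try the update first; FOUND tells us whether any row matched.
        UPDATE peers2
           SET uploaded   = uploaded + rec.uploaded,
               downloaded = downloaded + rec.downloaded,
               remaining  = rec.remaining,
               avg_up     = rec.avg_up,
               avg_down   = rec.avg_down,
               active     = rec.active,
               timespent  = timespent + rec.timespent,
               ip         = rec.ip,
               port       = rec.port,
               peer_id    = rec.peer_id,
               blocked    = rec.blocked,
               timestamp  = CURRENT_TIMESTAMP
         WHERE uid = rec.uid AND tid = rec.tid;

        -- No matching row: this is a new peer, so insert it instead.
        -- (Not safe against concurrent writers inserting the same (uid, tid);
        -- fine if the tracker is the only process writing to peers2.)
        IF NOT FOUND THEN
            INSERT INTO peers2 (tid, uid, uploaded, downloaded, remaining,
                                avg_up, avg_down, active, timespent, ip,
                                port, peer_id, blocked)
            VALUES (rec.tid, rec.uid, rec.uploaded, rec.downloaded,
                    rec.remaining, rec.avg_up, rec.avg_down, rec.active,
                    rec.timespent, rec.ip, rec.port, rec.peer_id, rec.blocked);
        END IF;
    END LOOP;
    RETURN 1;
END;
$PROC$ LANGUAGE plpgsql;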
I cannot think of any approach that would be faster than the original bulk UPDATE, and even that is far, far too slow. The tracker attempts to send a new buffer to Postgres every minute, so if each buffer takes ~3 minutes to process, it will simply backlog itself into oblivion, which means I might as well not bother porting the code at all.

I don't mean to come down on Postgres, but it surprises me that, for all the advanced features it has, MySQL utterly blows it out of the water for this one task. I'm absolutely willing to admit it's probably my fault for approaching the problem the wrong way, which is why I'm seeking help :) I also admit I haven't done a great deal of configuration file tuning on the Postgres setup, so if there is something obvious behind the poor performance, I'll be happy to test out any proposed changes and report back with the results.

If anyone has *ANY* ideas as to why this is so slow, or different approaches that may be faster than what I have tried, please e-mail me back; I've struggled with this for weeks with absolutely no results. If you need to see configuration files, sample data, or anything else, don't hesitate to ask; I'll be happy to provide as much as I can if it means there's a solution on the horizon.

For those of you who have actually read all of this, thank you for your patience; I know it's a huge message.

Thanks much,

Steve Gerhardt