Tom Lane wrote:
You're focusing on the wrong thing --- there's nothing wrong with the plan. It's only taking 9 seconds to perform the merge join. The other 183 seconds are going somewhere else; you need to find out where. One thing that came to mind was triggers, which would be shown in the EXPLAIN results if you are using a sufficiently recent version of PG (but you didn't say what you're using) ... however if this is a straight port of MySQL code it's pretty unlikely to have either custom triggers or foreign keys, so that is most likely the wrong guess. It may just be that it takes that long to update 26917 rows, which would suggest a configuration problem to me.
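One way to see where the time goes is to run the real UPDATE under EXPLAIN ANALYZE inside a transaction and then roll it back; on 8.1 and later the output also lists time spent in any triggers below the plan. The table and column names here are just placeholders for whatever the tracker actually uses:

    -- Placeholder schema: "peers" / "peers_tmp" stand in for the tracker's
    -- real tables.  Running the UPDATE inside BEGIN ... ROLLBACK lets you
    -- time it without keeping the changes.
    BEGIN;
    EXPLAIN ANALYZE
    UPDATE peers
       SET uploaded   = t.uploaded,
           downloaded = t.downloaded
      FROM peers_tmp t
     WHERE peers.peer_id = t.peer_id;
    -- Any trigger or foreign-key overhead shows up as separate "Trigger ..."
    -- lines after the plan, apart from the plan-node timings.
    ROLLBACK;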
Any suggestions for finding out where all the time is being spent? I'm running 8.2.0, by the way; it was a bit boneheaded of me not to mention that in the original message, but I'm planning on upgrading to 8.2.3 soon. I don't have any triggers or other procedures set up that would interfere with this, which is why I'm really confused by the enormous runtime. I agree that the merge join only takes 9 seconds, but it looks to me like the other 183 seconds are spent sequentially scanning both tables and then sorting the results, which I imagine is necessary for the merge join to take place. Sadly, I'm not familiar enough with the internals to know whether that's actually the case.
shared_buffers, wal_buffers, and checkpoint_segments seem like things you might need to increase.
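The right numbers depend on the machine, so treat these as illustrative guesses rather than recommendations, but something along these lines in postgresql.conf is a common starting point:

    # postgresql.conf -- illustrative starting values, not tuned recommendations
    shared_buffers = 256MB        # the default is far too small for a busy update workload
    wal_buffers = 1MB             # more room to buffer WAL before it has to be written
    checkpoint_segments = 32      # spread checkpoints out during heavy write bursts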
I'll try modifying those and report back with what kind of performance increases I can get.

Another problem with this approach is that it's not going to take long before the table is bloated beyond belief, if it's not vacuumed regularly. Do you have autovacuum turned on? Does the tracker tend to send a lot of null updates (no real change to the rows)? If so it'd be worth complicating the query to check for no-change and avoid the update for unchanged rows.
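For instance (table and column names invented for illustration), adding an IS DISTINCT FROM check to the join condition makes PostgreSQL skip rows whose values haven't changed instead of writing new row versions for them:

    -- Placeholder names: "peers" / "peers_tmp" stand in for the real schema.
    -- IS DISTINCT FROM handles NULLs, so genuinely unchanged rows fall out of
    -- the join and no dead row versions are created for them.
    UPDATE peers
       SET uploaded   = t.uploaded,
           downloaded = t.downloaded
      FROM peers_tmp t
     WHERE peers.peer_id = t.peer_id
       AND (peers.uploaded    IS DISTINCT FROM t.uploaded
            OR peers.downloaded IS DISTINCT FROM t.downloaded);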
The tracker is set up to run a VACUUM ANALYZE after each commit; I neglected to mention that. From the testing I've done, performance seems more or less the same whether or not the table has been vacuumed recently. Also, the tracker specifically ignores null updates where no data has changed, to cut down on the size of the data being sent.
Also, if you don't mind answering, I've been pretty puzzled about why the two stored procedures are substantially slower than the original method, since conceptually they seem like they should be a lot simpler. Am I missing something huge about the way Postgres works?
regards, tom lane
Thanks for the help, Tom. Steve Gerhardt