
Re: UPDATE on two large datasets is very slow

Tom Lane wrote:

> You're focusing on the wrong thing --- there's nothing wrong with the plan.
> It's only taking 9 seconds to perform the merge join.  The other 183
> seconds are going somewhere else; you need to find out where.
>
> One thing that came to mind was triggers, which would be shown in the
> EXPLAIN results if you are using a sufficiently recent version of PG
> (but you didn't say what you're using) ... however if this is a straight
> port of MySQL code it's pretty unlikely to have either custom triggers
> or foreign keys, so that is most likely the wrong guess.  It may just be
> that it takes that long to update 26917 rows, which would suggest a
> configuration problem to me.

Any suggestions for finding out where all the time is being spent? I'm running 8.2.0, by the way; it was a bit boneheaded of me not to mention that in the original message. I'm planning on upgrading to 8.2.3 soon. I don't have any triggers or other procedures set up that would interrupt this, which is why I'm so confused by the enormous runtime. I agree that the merge join itself only takes 9 seconds, but it looks to me like the other 183 seconds are spent sequentially scanning both tables and then sorting the results, which I imagine is necessary before the merge join can take place. Sadly, I'm not familiar enough with the internals to know whether this is the case.
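
For a sense of scale, the figures quoted above can be turned into per-row costs. This is only back-of-the-envelope arithmetic on the numbers in this thread (9 seconds for the merge join, 183 seconds elsewhere, 26917 rows); Python is used purely for illustration, and the variable names are made up.

```python
# Figures quoted in the thread; nothing here is measured independently.
rows_updated = 26917          # rows touched by the UPDATE
merge_join_seconds = 9        # time attributed to the merge join
other_seconds = 183           # time spent "somewhere else"

# Cost of the unexplained portion, spread evenly over the updated rows.
ms_per_row = other_seconds / rows_updated * 1000
rows_per_second = rows_updated / other_seconds

# Fraction of the total runtime that the join itself accounts for.
join_share = merge_join_seconds / (merge_join_seconds + other_seconds)

print(f"{ms_per_row:.1f} ms per row")           # 6.8 ms per row
print(f"{rows_per_second:.0f} rows per second")  # 147 rows per second
print(f"join share of total: {join_share:.1%}")  # 4.7%
```

In other words, the join accounts for under 5% of the total runtime, and the remainder averages out to several milliseconds per updated row.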

> shared_buffers, wal_buffers, and checkpoint_segments seem like things
> you might need to increase.

I'll try modifying those and report back with what kind of performance increases I can get.

> Another problem with this approach is that it's not going to take long
> before the table is bloated beyond belief, if it's not vacuumed
> regularly.  Do you have autovacuum turned on?
>
> Does the tracker tend to send a lot of null updates (no real change to
> the rows)?  If so it'd be worth complicating the query to check for
> no-change and avoid the update for unchanged rows.

The tracker is set up to run a VACUUM ANALYZE after each commit; I neglected to mention that. From the testing I've done, performance seems to be more or less the same whether or not the table has been vacuumed recently. Also, the tracker specifically ignores null updates, where no data has changed, to cut down on the amount of data being sent.
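
For anyone following along whose client does not already filter these out, the "check for no-change and avoid the update" idea can be sketched roughly as below. This is a minimal illustration, not the tracker's actual query: the `peers` and `incoming` tables, their columns, and the `apply_updates` helper are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so that the sketch is self-contained. SQLite's NULL-safe `IS NOT` plays the role that `IS DISTINCT FROM` would play in PostgreSQL.

```python
import sqlite3

# Hypothetical schema: the thread does not show the real tracker tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE peers    (id INTEGER PRIMARY KEY, uploaded INTEGER, downloaded INTEGER);
    CREATE TABLE incoming (id INTEGER PRIMARY KEY, uploaded INTEGER, downloaded INTEGER);
    INSERT INTO peers    VALUES (1, 100, 50), (2, 200, NULL), (3, 300, 75);
    INSERT INTO incoming VALUES (1, 100, 50), (2, 200, 10),   (3, 999, 75);
""")

def apply_updates(conn):
    """Copy incoming values into peers, touching only rows that actually differ.

    Returns the number of rows the UPDATE modified.
    """
    cur = conn.execute("""
        UPDATE peers
           SET uploaded   = (SELECT i.uploaded   FROM incoming i WHERE i.id = peers.id),
               downloaded = (SELECT i.downloaded FROM incoming i WHERE i.id = peers.id)
         WHERE EXISTS (
               SELECT 1
                 FROM incoming i
                WHERE i.id = peers.id
                  -- NULL-safe inequality; PostgreSQL spells this IS DISTINCT FROM
                  AND (i.uploaded   IS NOT peers.uploaded
                    OR i.downloaded IS NOT peers.downloaded))
    """)
    conn.commit()
    return cur.rowcount
```

With the sample data above, the first call to `apply_updates(conn)` modifies two rows (ids 2 and 3) and leaves the unchanged row 1 alone; a second call modifies none. The point of the extra `WHERE` condition is that rows with no real change are never rewritten.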

Also, if you don't mind answering, I've been pretty puzzled about why the two stored procedures are substantially slower than the original method, since the concept in my head seems like it would be much simpler. Am I missing something huge about the way Postgres works?

> 			regards, tom lane

Thanks for the help, Tom.

Steve Gerhardt

