Farhan Husain <russoue@xxxxxxxxx> writes:
> Here is the output:

I see a couple of things going on here:

* The planner is choosing to use a sort-and-mergejoin for the second join. This requires sorting all of jena_g1t1_stmt. If it had accurately estimated the output size of the first join (ie 30 rows, not 30K rows) it'd likely have gone for a nestloop join instead, assuming that you have an index on jena_g1t1_stmt.subj. You need to try to reduce the 1000X error factor in that estimate. I'm not sure how much it will help to increase the stats targets on the input tables, but that's the first thing to try.

* Considering that the sort is all in memory, 5400 seconds seems like a heck of a long time even for sorting 3M rows. Since we already found out you're using a non-C locale, the sort comparisons are ultimately strcoll() calls, and I'm betting that you are on a platform where strcoll() is really, really slow. Another possibility is that you don't really have 500MB of memory to spare, and the sort workspace is getting swapped in and out (read: thrashed). Setting work_mem higher than you can afford is a bad idea.

In comparison to MySQL, I think that their planner will use an indexed nestloop if it possibly can, which makes it look great on this type of query (but not so hot if the query actually does need to join a lot of rows).

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
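
[Editor's note: a minimal sketch of the "increase the stats targets" suggestion above, assuming subj is the join column and that the index may still need to be created; the statistics value of 1000 and the index name are illustrative, not from the thread.]

```sql
-- Raise the per-column statistics target on the join column, then
-- re-ANALYZE so the planner re-samples with the larger target.
-- This is what can shrink the 1000X row-estimate error.
ALTER TABLE jena_g1t1_stmt ALTER COLUMN subj SET STATISTICS 1000;
ANALYZE jena_g1t1_stmt;

-- The nestloop alternative also needs an index on the inner side;
-- create it if it does not already exist (index name is hypothetical).
CREATE INDEX jena_g1t1_stmt_subj_idx ON jena_g1t1_stmt (subj);
```

After re-ANALYZE, re-running EXPLAIN ANALYZE on the query should show whether the first join's estimated row count has moved closer to the actual 30 rows, and whether the planner now picks the nestloop.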
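
[Editor's note: a sketch for checking the second bullet's two suspects, the non-C locale and an over-committed work_mem; the 64MB figure is only an example, not a recommendation from the thread.]

```sql
-- Confirm the locale driving sort comparisons (non-C means strcoll()).
SHOW lc_collate;

-- Check the current sort workspace; if this exceeds memory you can
-- actually spare, the sort space gets swapped (thrashed).
SHOW work_mem;

-- Try a smaller, affordable value for this session and re-test.
SET work_mem = '64MB';
```

Lowering work_mem to fit in real memory can be faster than a nominally in-memory sort that is actually being paged by the OS.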