Re: Performance trouble finding records through related records

On 03/02/2011 06:12 PM, sverhagen wrote:
Thanks for your help already!
Hope you're up for some more :-)


Andy Colson wrote:

First off, excellent detail.

Second, your explain analyze was hard to read... but since you are not
really interested in your posted query, I won't worry about looking at
it... but... have you seen:

http://explain.depesz.com/


Thanks for that. Using it below :-)


Andy Colson wrote:

If you run the individual queries, without the union, are the parts slow
too?


Only the second part is slow, so that part can safely be isolated. Also,
the following does not play a role at this point: WHERE
events2.eventtype_id IN
(100,103,105,...

Then I went ahead and denormalized the transactionId on both ends, so that
both events_events records and events_eventdetails records have the
transactionId (or NULL). That simplifies the query to this:

	SELECT events_events.* FROM events_events WHERE transactionid IN (
		SELECT transactionid FROM events_eventdetails customerDetails
		WHERE customerDetails.keyname='customer_id'
		AND substring(customerDetails.value,0,32)='1957'
		AND transactionid IS NOT NULL
	) ORDER BY id LIMIT 50;

To no avail. Also, changing the above WHERE IN into an implicit/explicit
JOIN (sketched below) makes no more than a marginal difference. Shouldn't
a join be more efficient somehow?
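
For reference, the explicit-join variant was roughly the following (a
sketch; the exact aliases used may have differed):

	-- join form of the WHERE IN query above
	SELECT events_events.*
	FROM events_events
	JOIN events_eventdetails customerDetails
		ON customerDetails.transactionid = events_events.transactionid
	WHERE customerDetails.keyname='customer_id'
	AND substring(customerDetails.value,0,32)='1957'
	ORDER BY events_events.id LIMIT 50;

(Note that, unlike the IN form, a plain join can return the same
events_events row more than once if several detail rows share a
transactionid.)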

http://explain.depesz.com/s/Pnb

The above link nicely shows the hotspots, but I am now at a loss as to how
to approach them.


Andy Colson wrote:

Looked like your row counts (the estimate vs. the actual) were way off;
have you analyzed lately?


Not sure what that means.
Isn't all the maintenance nicely automated through my config?



In the explain analyze you'll see stuff like:
Append  (cost=0.00..3256444445.93 rows=115469434145 width=52) (actual time=0.304..58763.738 rows=222 loops=1)

This is taken from your first email.  Red flags should go off when the row counts are not close.  The first set is the planner's guess; the second set is what actually happened.  The planner thought there would be 115,469,434,145 rows... but it turned out to be only 222.  That's usually caused by bad stats.
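
If in doubt, you can refresh the stats by hand and check when autovacuum
last analyzed the tables. These are standard PostgreSQL commands, with
your table names filled in:

	ANALYZE events_events;
	ANALYZE events_eventdetails;

	-- when did (auto)analyze last touch these tables?
	SELECT relname, last_analyze, last_autoanalyze
	FROM pg_stat_user_tables
	WHERE relname IN ('events_events', 'events_eventdetails');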

Isn't all the maintenance nicely automated through my config?


I'd never assume.  But the numbers in the plan you posted:

http://explain.depesz.com/s/Pnb

look fine to me (well, the row counts), and I didn't look too much at that plan in the first email, so we can probably ignore it.


Andy Colson wrote:

I could not tell from the explain analyze if an index was used, but I
notice you have a ton of indexes on the events_events table.


Yes, a ton of indexes, but still not the right one :-)

But... many indexes will slow down updates/inserts.  And an index on an unselective field can cause more problems than it helps, especially if the stats are off.  If PG has lots and lots of options, it'll take longer to plan queries too.  If it picks an index that it thinks is selective, but in reality is not, you are in for a world of hurt.
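
One quick sanity check: the standard statistics views show which of those
indexes ever actually get used. For example (table name from your schema):

	-- indexes that are never scanned are pure insert/update overhead
	SELECT indexrelname, idx_scan, idx_tup_read
	FROM pg_stat_user_indexes
	WHERE relname = 'events_events'
	ORDER BY idx_scan;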

For your query, I think a join would be the best bet; can we see its explain analyze?

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

