Re: Query-Planer from 6seconds TO DAYS

Böckler Andreas <andy@xxxxxxxxxxxx> · Wed, 24 Oct 2012 20:51:33 +0200

Hi Jeff,

thanks for your answer!

Am 24.10.2012 um 19:00 schrieb Jeff Janes:

> On Wed, Oct 24, 2012 at 8:41 AM, Böckler Andreas <andy@xxxxxxxxxxxx> wrote:
> 
>> SELECT  m.machine_id, s.timestamp, s.errorcode
>> FROM events m INNER JOIN spsdata as s ON (m.machine_id= s.machine_id
> 
> m.machine_id is equal to itself?  you must be retyping the query by hand…
Yes I did …  i changed the vars from german to english .. 
That should be m.machine_id=s.machine_id
> 
> You should report the results of "EXPLAIN ANALYZE" rather than merely
> EXPLAIN, as that would make it much easier to verify where the
> selectivity estimates are off.
> 
OK .. 
i can do that for the FAST query. 
But the other one would take days. (see below )

> 
>> FAST:
>>                                                                            QUERY PLAN
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Nested Loop  (cost=0.00..144979241.24 rows=42662 width=14)
>>   Join Filter: ((s."timestamp" <= m."timestamp") AND (m.machine_id = s.machine_id) AND (s."timestamp" >= (m."timestamp" - '00:00:30'::interval)))
>>   ->  Index Scan using events_code on events m  (cost=0.00..4911.18 rows=25 width=12)
>>         Index Cond: (code = 2024)
>>         Filter: (("timestamp" >= '2012-08-14 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2012-08-26 00:00:00'::timestamp without time zone))
>>   ->  Append  (cost=0.00..5770958.44 rows=1400738 width=14)
>>         ->  Index Scan using spsdata_machine_id on spsdata s  (cost=0.00..4.11 rows=1 width=14)
>>               Index Cond: (s.machine_id = m.machine_id)
> 
> Was there more to the plan that you snipped?  If not, why isn't it
> checking all the other partitions?

Your right. It's checking all partitions!. So the constraint exclusion doesn't kick in.
This can be fixed with
SELECT  
	m.machine_id, s.timestamp, s.errorcode
FROM 
	events m 
	INNER JOIN spsdata as s ON (m.machine_id=s.machine_id AND s.timestamp BETWEEN m.timestamp - interval '30 seconds' AND m.timestamp)
WHERE 
	m.code IN (2024) 
	AND m.timestamp  BETWEEN '2012-08-01' AND '2012-08-29' 
	AND s.timestamp  BETWEEN '2012-08-01' AND '2012-08-29' 
	AND s.errorcode in ('2024');

It doesn't take hours to end, but it's not the performance gain you would expect.

I'v changed the query to one partition spsdata_2012m08  and attached the slow and fast cases with EXPLAIN ANALYZE.

The difference is one day in the WHERE-Clause
290.581 ms VS  687887.674 ms !
Thats 2372 times slower.

How can i force the fast query plan in a select?

At least I know that spsdata_2012m08 has way more records than events
spsdata_2012m08: reltuples -> 5.74082 * 10^7
events: count(1) for that time range -> 51383

> 
> If you can't fix the selectivity estimates, one thing you could do to
> drive it to the faster query is to decrease random_page_cost to be the
> same seq_page_cost.  That should push the cross-over point to the
> sequential scan out to a region you might not care about.  However, it
> could also drive other queries in your system to use worse plans than
> they currently are.
> Or, you could "set enable_seqscan = off" before running this
> particular query, then reset it afterwards.
> 
> Cheers,
> 
> Jeff
I've played with seq_page_cost and enable_seqscan already, but you have to know the right values before SELECT to get good results ;)

Cheers,

Andy

--  
Andreas Böckler
andy@xxxxxxxxxxxx
Attachment:
fast_explain_analyze.log

Description: Binary data
Attachment:
partition_selectfast.log

Description: Binary data
Attachment:
partition_selectfast.sql

Description: Binary data
Attachment:
partition_selectslow.log

Description: Binary data
Attachment:
partition_selectslow.sql

Description: Binary data
-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance