On Fri, 27 Mar 2009, Віталій Тимчишин wrote:
...an index on (objectid, start) would help...
Definitely.
You could try adding "AND l2.start > l1.start" to the first query.
This will drop symmetric half of intersections (the ones that will
remain are l2 inside or to the left of l1), but you can redo results by
id1,id2 union all id2, id1 and may allow to use start index for
"between"
That's remarkably clever, and should have been obvious. Thanks.
Adding the constraint as you say does indeed make the query fast. However,
there seems to be a problem with the planner, in that it does not
distinguish between the costs of two alternative plans, which have vastly
different real costs. Consider the following:
SELECT
l1.id AS id1,
l2.id AS id2
FROM
location l1,
location l2
WHERE
l1.objectid = 228000093
AND l2.objectid = 228000093
AND l1.id <> l2.id
AND l1.start < l2.end
AND l1.end > l2.start
AND l1.start < l2.start
UNION ALL
SELECT
l1.id AS id1, l2.id AS id2
FROM
location l1,
location l2
WHERE
l1.objectid = 228000093
AND l2.objectid = 228000093
AND l1.id <> l2.id
AND l1.start < l2.end
AND l1.end > l2.start
AND l1.start >= l2.start;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..20479179.74 rows=138732882 width=8)
-> Nested Loop (cost=0.00..9545925.46 rows=69366441 width=8)
Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
-> Index Scan using location_test_obj_end on location l1 (cost=0.00..55966.07 rows=43277 width=12)
Index Cond: (objectid = 228000093)
-> Index Scan using location_test_obj_start on location l2 (cost=0.00..123.10 rows=4809 width=12)
Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start < l2.start))
-> Nested Loop (cost=0.00..9545925.46 rows=69366441 width=8)
Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
-> Index Scan using location_test_obj_end on location l1 (cost=0.00..55966.07 rows=43277 width=12)
Index Cond: (objectid = 228000093)
-> Index Scan using location_test_obj_start on location l2 (cost=0.00..123.10 rows=4809 width=12)
Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start >= l2.start))
(13 rows)
Notice the two different index conditions:
(l1.end > l2.start) AND (l1.start < l2.start) - "between"
(l1.end > l2.start) AND (l1.start >= l2.start) - open-ended
Both have a cost of (cost=0.00..123.10 rows=4809 width=12)
Postgres estimates these two index scans to be equivalent in cost, where
they are actually vastly different in real cost. Shouldn't Postgres favour
a "between" index scan over an open-ended one?
Matthew
--
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother. -- Computer Science Lecturer
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance