Tom Lane wrote: > Michael Stone <mstone+postgres@xxxxxxxxx> writes: >> OTOH, the planner can really screw up queries on really large databases. >> ... I've got some queries that the >> planner thinks will return on the order of 10^30 rows for that sort of >> reason. In practice, the query may return 10^3 rows.... > > Indeed, and if you've got examples where it's that far off, you should > report them. If I read this right, I've got quite a few cases where the planner expects 1 row but gets over 2000. And within the plan, it looks like there's a step where it expects 511 rows and gets 2390779 which seems to be off by a factor of 4600x. Also shown below it seems that if I use "OFFSET 0" as a "hint" I can force a much (10x) better plan. I wonder if there's room for a pgfoundry project for a patch set that lets us use more hints than OFFSET 0. Ron logs=# analyze; ANALYZE logs=# explain analyze select * from fact natural join d_ref natural join d_uag where ref_host = 'download.com.com' and ref_path = '/[path_removed].html' and useragent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=2827.72..398919.05 rows=1 width=242) (actual time=69175.963..141550.628 rows=2474 loops=1) Hash Cond: (fact.ref_id = d_ref.ref_id) -> Nested Loop (cost=2819.88..398908.65 rows=511 width=119) (actual time=3094.740..139361.235 rows=2390779 loops=1) -> Index Scan using i_uag__val on d_uag (cost=0.00..6.38 rows=1 width=91) (actual time=45.937..45.948 rows=1 loops=1) Index Cond: ((useragent)::text = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)'::text) -> Bitmap Heap Scan on fact (cost=2819.88..396449.49 rows=196223 width=32) (actual time=3048.770..135653.875 rows=2390779 loops=1) Recheck Cond: (fact.uag_id = d_uag.uag_id) -> Bitmap Index Scan on i__fact__uag_id (cost=0.00..2770.83 rows=196223 width=0) (actual time=1713.148..1713.148 rows=2390779 loops=1) Index Cond: (fact.uag_id = d_uag.uag_id) -> Hash (cost=7.83..7.83 rows=1 width=127) (actual time=62.841..62.841 rows=2 loops=1) -> Index Scan using i_ref__val on d_ref (cost=0.00..7.83 rows=1 width=127) (actual time=62.813..62.823 rows=2 loops=1) Index Cond: (((ref_path)::text = '[path_removed].html'::text) AND ((ref_host)::text = 'download.com.com'::text)) Total runtime: 141563.733 ms (13 rows) ############ using "offset 0" to force a better plan. logs=# explain analyze select * from fact natural join (select * from d_ref natural join d_uag where ref_host = 'download.com.com' and ref_path = '/[path_removed].html' and useragent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)' offset 0) as a; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=6465.12..7575.91 rows=367 width=2096) (actual time=2659.251..14703.343 rows=2474 loops=1) -> Limit (cost=0.00..14.22 rows=1 width=218) (actual time=114.968..115.140 rows=2 loops=1) -> Nested Loop (cost=0.00..14.22 rows=1 width=218) (actual time=114.964..115.127 rows=2 loops=1) -> Index Scan using i_ref__val on d_ref (cost=0.00..7.83 rows=1 width=127) (actual time=75.891..75.900 rows=2 loops=1) Index Cond: (((ref_path)::text = '[path_removed].html'::text) AND ((ref_host)::text = 'download.com.com'::text)) -> Index Scan using i_uag__val on d_uag (cost=0.00..6.38 rows=1 width=91) (actual time=19.582..19.597 rows=1 loops=2) Index Cond: ((useragent)::text = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)'::text) -> Bitmap Heap Scan on fact (cost=6465.12..7556.18 rows=367 width=32) (actual time=2240.090..7288.145 rows=1237 loops=2) Recheck Cond: ((fact.uag_id = a.uag_id) AND (fact.ref_id = a.ref_id)) -> BitmapAnd (cost=6465.12..6465.12 rows=367 width=0) (actual time=2221.539..2221.539 rows=0 loops=2) -> Bitmap Index Scan on i__fact__uag_id (cost=0.00..2770.83 rows=196223 width=0) (actual time=1633.032..1633.032 rows=2390779 loops=2) Index Cond: (fact.uag_id = a.uag_id) -> Bitmap Index Scan on i__fact__ref_id (cost=0.00..3581.50 rows=253913 width=0) (actual time=150.614..150.614 rows=77306 loops=2) Index Cond: (fact.ref_id = a.ref_id) Total runtime: 14710.870 ms (15 rows) ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match