On Sun, Apr 16, 2006 at 04:32:25PM -0400, Tom Lane wrote: > felix-accts-pgsql@xxxxxxxxxxx writes: > > EXPLAIN ANALYZE doesn't show the slow timing > > because it requires values, not $n placeholders, > > To analyze the plan used for a parameterized query, try > > PREPARE foo(...) AS SELECT ... $n ... > > EXPLAIN ANALYZE EXECUTE foo(...) > > But I already know what you're going to find: the planner's estimates > for the range query are not going to be very good when it has no idea > what the range bounds are. This is a situation where it may be best > to absorb the hit of re-planning each time instead of using a generic > parameterized plan. OK, here is the new explain analyze. I eliminated cache effects by dumping the tables and picking random values with an editor. felix=> PREPARE foo(TEXT, INT, INT) AS SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = $1 AND k.id = g.key) AND (v.data >= $2 AND v.data <= $3) AND v.id = g.val AND g.sid = s.id; PREPARE felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1380.11..404223.36 rows=499 width=60) (actual time=5785.012..77823.688 rows=1 loops=1) -> Hash Join (cost=1380.11..402713.38 rows=499 width=16) (actual time=5766.308..77804.969 rows=1 loops=1) Hash Cond: ("outer".val = "inner".id) -> Nested Loop (cost=0.00..400829.78 rows=99701 width=16) (actual time=115.154..77401.159 rows=100000 loops=1) -> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.125..0.132 rows=1 loops=1) Index Cond: (data = $1) -> Index Scan using glue_int_key_idx on glue_int g (cost=0.00..399577.70 rows=99701 width=16) (actual time=115.011..76570.366 rows=100000 loops=1) Index Cond: ("outer".id = g."key") -> Hash (cost=1378.86..1378.86 rows=500 width=4) (actual time=11.580..11.580 rows=0 loops=1) -> Index Scan using val_int_data_key on val_int v (cost=0.00..1378.86 rows=500 width=4) (actual time=11.556..11.561 rows=1 loops=1) Index Cond: ((data >= $2) AND (data <= $3)) -> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=18.682..18.687 rows=1 loops=1) Index Cond: ("outer".sid = s.id) Total runtime: 77823.897 ms (14 rows) A repeat shows it faster, from 77 seconds to 3. felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1380.11..404223.36 rows=499 width=60) (actual time=205.137..2931.899 rows=1 loops=1) -> Hash Join (cost=1380.11..402713.38 rows=499 width=16) (actual time=205.056..2931.803 rows=1 loops=1) Hash Cond: ("outer".val = "inner".id) -> Nested Loop (cost=0.00..400829.78 rows=99701 width=16) (actual time=0.148..2564.255 rows=100000 loops=1) -> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.031..0.039 rows=1 loops=1) Index Cond: (data = $1) -> Index Scan using glue_int_key_idx on glue_int g (cost=0.00..399577.70 rows=99701 width=16) (actual time=0.105..1808.068 rows=100000 loops=1) Index Cond: ("outer".id = g."key") -> Hash (cost=1378.86..1378.86 rows=500 width=4) (actual time=0.090..0.090 rows=0 loops=1) -> Index Scan using val_int_data_key on val_int v (cost=0.00..1378.86 rows=500 width=4) (actual time=0.074..0.080 rows=1 loops=1) Index Cond: ((data >= $2) AND (data <= $3)) -> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=0.061..0.066 rows=1 loops=1) Index Cond: ("outer".sid = s.id) Total runtime: 2932.013 ms (14 rows) And running it as a simple query shows it much faster, 72 ms. felix=> EXPLAIN ANALYZE SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = 'mthNQFrmVs3Q4bVruCxIAGy' AND k.id = g.key) AND (v.data >= 1973028023 AND v.data <= 1973028223) AND v.id = g.val AND g.sid = s.id; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=5.82..1162.41 rows=1 width=60) (actual time=53.341..71.996 rows=1 loops=1) -> Hash Join (cost=5.82..1159.38 rows=1 width=16) (actual time=53.280..71.921 rows=1 loops=1) Hash Cond: ("outer"."key" = "inner".id) -> Nested Loop (cost=0.00..1148.54 rows=1001 width=16) (actual time=40.525..67.962 rows=1000 loops=1) -> Index Scan using val_int_data_key on val_int v (cost=0.00..6.01 rows=1 width=4) (actual time=0.068..0.072 rows=1 loops=1) Index Cond: ((data >= 1973028023) AND (data <= 1973028223)) -> Index Scan using glue_int_val_idx on glue_int g (cost=0.00..730.26 rows=32982 width=16) (actual time=40.441..60.218 rows=1000 loops=1) Index Cond: ("outer".id = g.val) -> Hash (cost=5.82..5.82 rows=1 width=4) (actual time=0.067..0.067 rows=0 loops=1) -> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.049..0.054 rows=1 loops=1) Index Cond: (data = 'mthNQFrmVs3Q4bVruCxIAGy'::text) -> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=0.048..0.052 rows=1 loops=1) Index Cond: ("outer".sid = s.id) Total runtime: 72.099 ms (14 rows) But the plan has also changed, from outer.val to outer.key. Since the key table is 100 times smaller than the val table, I can see why one executes so much faster. But is 77 seconds reasonable? It just seems incredibly slow. -- ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._. Felix Finch: scarecrow repairman & rocket surgeon / felix@xxxxxxxxxxx GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933 I've found a solution to Fermat's Last Theorem but I see I've run out of room o