Re: Query runs fast or slow

felix@xxxxxxxxxxx · Mon, 17 Apr 2006 13:52:40 -0700

On Sun, Apr 16, 2006 at 04:32:25PM -0400, Tom Lane wrote:
> felix-accts-pgsql@xxxxxxxxxxx writes:
> > EXPLAIN ANALYZE doesn't show the slow timing
> > because it requires values, not $n placeholders,
> 
> To analyze the plan used for a parameterized query, try
> 
> 	PREPARE foo(...) AS SELECT ... $n ...
> 
> 	EXPLAIN ANALYZE EXECUTE foo(...)
> 
> But I already know what you're going to find: the planner's estimates
> for the range query are not going to be very good when it has no idea
> what the range bounds are.  This is a situation where it may be best
> to absorb the hit of re-planning each time instead of using a generic
> parameterized plan.

OK, here is the new explain analyze.  I eliminated cache effects by
dumping the tables and picking random values with an editor.

felix=> PREPARE foo(TEXT, INT, INT) AS SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = $1 AND k.id = g.key) AND (v.data >= $2 AND v.data <= $3) AND v.id = g.val AND g.sid = s.id;
PREPARE
felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223);
                                                                             QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1380.11..404223.36 rows=499 width=60) (actual time=5785.012..77823.688 rows=1 loops=1)
   ->  Hash Join  (cost=1380.11..402713.38 rows=499 width=16) (actual time=5766.308..77804.969 rows=1 loops=1)
         Hash Cond: ("outer".val = "inner".id)
         ->  Nested Loop  (cost=0.00..400829.78 rows=99701 width=16) (actual time=115.154..77401.159 rows=100000 loops=1)
               ->  Index Scan using key_data_key on "key" k  (cost=0.00..5.82 rows=1 width=4) (actual time=0.125..0.132 rows=1 loops=1)
                     Index Cond: (data = $1)
               ->  Index Scan using glue_int_key_idx on glue_int g  (cost=0.00..399577.70 rows=99701 width=16) (actual time=115.011..76570.366 rows=100000 loops=1)
                     Index Cond: ("outer".id = g."key")
         ->  Hash  (cost=1378.86..1378.86 rows=500 width=4) (actual time=11.580..11.580 rows=0 loops=1)
               ->  Index Scan using val_int_data_key on val_int v  (cost=0.00..1378.86 rows=500 width=4) (actual time=11.556..11.561 rows=1 loops=1)
                     Index Cond: ((data >= $2) AND (data <= $3))
   ->  Index Scan using sid_pkey on sid s  (cost=0.00..3.01 rows=1 width=52) (actual time=18.682..18.687 rows=1 loops=1)
         Index Cond: ("outer".sid = s.id)
 Total runtime: 77823.897 ms
(14 rows)

A repeat shows it faster, from 77 seconds to 3.

felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223);
                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1380.11..404223.36 rows=499 width=60) (actual time=205.137..2931.899 rows=1 loops=1)
   ->  Hash Join  (cost=1380.11..402713.38 rows=499 width=16) (actual time=205.056..2931.803 rows=1 loops=1)
         Hash Cond: ("outer".val = "inner".id)
         ->  Nested Loop  (cost=0.00..400829.78 rows=99701 width=16) (actual time=0.148..2564.255 rows=100000 loops=1)
               ->  Index Scan using key_data_key on "key" k  (cost=0.00..5.82 rows=1 width=4) (actual time=0.031..0.039 rows=1 loops=1)
                     Index Cond: (data = $1)
               ->  Index Scan using glue_int_key_idx on glue_int g  (cost=0.00..399577.70 rows=99701 width=16) (actual time=0.105..1808.068 rows=100000 loops=1)
                     Index Cond: ("outer".id = g."key")
         ->  Hash  (cost=1378.86..1378.86 rows=500 width=4) (actual time=0.090..0.090 rows=0 loops=1)
               ->  Index Scan using val_int_data_key on val_int v  (cost=0.00..1378.86 rows=500 width=4) (actual time=0.074..0.080 rows=1 loops=1)
                     Index Cond: ((data >= $2) AND (data <= $3))
   ->  Index Scan using sid_pkey on sid s  (cost=0.00..3.01 rows=1 width=52) (actual time=0.061..0.066 rows=1 loops=1)
         Index Cond: ("outer".sid = s.id)
 Total runtime: 2932.013 ms
(14 rows)

And running it as a simple query shows it much faster, 72 ms.

felix=> EXPLAIN ANALYZE SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = 'mthNQFrmVs3Q4bVruCxIAGy' AND k.id = g.key) AND (v.data >= 1973028023 AND v.data <= 1973028223) AND v.id = g.val AND g.sid = s.id;
                                                                        QUERY PLAN                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.82..1162.41 rows=1 width=60) (actual time=53.341..71.996 rows=1 loops=1)
   ->  Hash Join  (cost=5.82..1159.38 rows=1 width=16) (actual time=53.280..71.921 rows=1 loops=1)
         Hash Cond: ("outer"."key" = "inner".id)
         ->  Nested Loop  (cost=0.00..1148.54 rows=1001 width=16) (actual time=40.525..67.962 rows=1000 loops=1)
               ->  Index Scan using val_int_data_key on val_int v  (cost=0.00..6.01 rows=1 width=4) (actual time=0.068..0.072 rows=1 loops=1)
                     Index Cond: ((data >= 1973028023) AND (data <= 1973028223))
               ->  Index Scan using glue_int_val_idx on glue_int g  (cost=0.00..730.26 rows=32982 width=16) (actual time=40.441..60.218 rows=1000 loops=1)
                     Index Cond: ("outer".id = g.val)
         ->  Hash  (cost=5.82..5.82 rows=1 width=4) (actual time=0.067..0.067 rows=0 loops=1)
               ->  Index Scan using key_data_key on "key" k  (cost=0.00..5.82 rows=1 width=4) (actual time=0.049..0.054 rows=1 loops=1)
                     Index Cond: (data = 'mthNQFrmVs3Q4bVruCxIAGy'::text)
   ->  Index Scan using sid_pkey on sid s  (cost=0.00..3.01 rows=1 width=52) (actual time=0.048..0.052 rows=1 loops=1)
         Index Cond: ("outer".sid = s.id)
 Total runtime: 72.099 ms
(14 rows)

But the plan has also changed, from outer.val to outer.key.  Since the
key table is 100 times smaller than the val table, I can see why one
executes so much faster.

But is 77 seconds reasonable?  It just seems incredibly slow.

-- 
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / felix@xxxxxxxxxxx
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o