Strange query plan invloving a view

Rich Doughty <rich@xxxxxxxxxx> · Thu, 17 Nov 2005 01:06:42 +0000

i have the following query involving a view that i really need to optimise:

SELECT *
FROM
	tokens.ta_tokenhist h INNER JOIN
	tokens.vw_tokens    t ON h.token_id = t.token_id
WHERE
	h.sarreport_id = 9
;

where vw_tokens is defined as

CREATE VIEW tokens.vw_tokens AS SELECT
	-- too many columns to mention
FROM
	tokens.ta_tokens      t LEFT JOIN
	tokens.ta_tokenhist   i ON t.token_id = i.token_id AND
                                   i.status   = 'issued'   LEFT JOIN
	tokens.ta_tokenhist   s ON t.token_id = s.token_id AND
                                   s.status   = 'sold'     LEFT JOIN
	tokens.ta_tokenhist   r ON t.token_id = r.token_id AND
                                   r.status   = 'redeemed'
;

this gives me the following query plan:

Merge Join  (cost=18276278.45..31793043.16 rows=55727 width=322)
   Merge Cond: (("outer".token_id)::integer = "inner"."?column23?")
   ->  Merge Left Join  (cost=18043163.64..31639175.71 rows=4228018 width=76)
         Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
         ->  Merge Left Join  (cost=13649584.94..27194793.37 rows=4228018 width=48)
               Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
               ->  Merge Left Join  (cost=7179372.62..20653326.29 rows=4228018 width=44)
                     Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
                     ->  Index Scan using ta_tokens_pkey on ta_tokens t  (cost=0.00..13400398.89 rows=4053805 width=27)
                     ->  Sort  (cost=7179372.62..7189942.67 rows=4228018 width=21)
                           Sort Key: (i.token_id)::integer
                           ->  Index Scan using fkx_tokenhist__status on ta_tokenhist i  (cost=0.00..6315961.47 rows=4228018 width=21)
                                 Index Cond: ((status)::text = 'issued'::text)
               ->  Sort  (cost=6470212.32..6479909.69 rows=3878949 width=8)
                     Sort Key: (s.token_id)::integer
                     ->  Index Scan using fkx_tokenhist__status on ta_tokenhist s  (cost=0.00..5794509.99 rows=3878949 width=8)
                           Index Cond: ((status)::text = 'sold'::text)
         ->  Sort  (cost=4393578.70..4400008.00 rows=2571718 width=32)
               Sort Key: (r.token_id)::integer
               ->  Index Scan using fkx_tokenhist__status on ta_tokenhist r  (cost=0.00..3841724.02 rows=2571718 width=32)
                     Index Cond: ((status)::text = 'redeemed'::text)
   ->  Sort  (cost=233114.81..233248.38 rows=53430 width=246)
         Sort Key: (h.token_id)::integer
         ->  Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h  (cost=0.00..213909.12 rows=53430 width=246)
               Index Cond: ((sarreport_id)::integer = 9)

However, the following query (which i believe should be equivalent)

SELECT *
FROM
	tokens.ta_tokenhist   h INNER JOIN
	tokens.ta_tokens      t ON h.token_id = t.token_id LEFT JOIN
	tokens.ta_tokenhist   i ON t.token_id = i.token_id AND
                                   i.status   = 'issued'   LEFT JOIN
	tokens.ta_tokenhist   s ON t.token_id = s.token_id AND
                                   s.status   = 'sold'     LEFT JOIN
	tokens.ta_tokenhist   r ON t.token_id = r.token_id AND
                                   r.status   = 'redeemed'
WHERE
	h.sarreport_id = 9
;

gives the following query plan:

 Nested Loop Left Join  (cost=0.00..3475785.52 rows=55727 width=1011)
   ->  Nested Loop Left Join  (cost=0.00..2474425.17 rows=55727 width=765)
         ->  Nested Loop Left Join  (cost=0.00..1472368.23 rows=55727 width=519)
               ->  Nested Loop  (cost=0.00..511614.87 rows=53430 width=273)
                     ->  Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h  (cost=0.00..213909.12 rows=53430 width=246)
                           Index Cond: ((sarreport_id)::integer = 9)
                     ->  Index Scan using ta_tokens_pkey on ta_tokens t  (cost=0.00..5.56 rows=1 width=27)
                           Index Cond: (("outer".token_id)::integer = (t.token_id)::integer)
               ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist i  (cost=0.00..17.96 rows=2 width=246)
                     Index Cond: (("outer".token_id)::integer = (i.token_id)::integer)
                     Filter: ((status)::text = 'issued'::text)
         ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist s  (cost=0.00..17.96 rows=2 width=246)
               Index Cond: (("outer".token_id)::integer = (s.token_id)::integer)
               Filter: ((status)::text = 'sold'::text)
   ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist r  (cost=0.00..17.96 rows=1 width=246)
         Index Cond: (("outer".token_id)::integer = (r.token_id)::integer)
         Filter: ((status)::text = 'redeemed'::text)

This query returns a lot quicker than the plan would suggest, as the
planner is over-estimating the amount of rows where
((sarreport_id)::integer = 9). it thinks there are 53430 when in fact
there are only 7 (despite a vacuum and analyse).

Can anyone give me any suggestions? are the index stats the cause of
my problem, or is it the rewrite of the query?

Cheers

Version: PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)

--

  - Rich Doughty

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org