Re: Simple Join

Mark Kirkwood <markir@xxxxxxxxxxxxxxx> · Wed, 21 Dec 2005 19:55:20 +1300

Mark Kirkwood wrote:
Kevin Brown wrote:

I'll just start by warning that I'm new-ish to postgresql.

I'm running 8.1 installed from source on a Debian Sarge server.  I 
have a simple query that I believe I've placed the indexes correctly 
for, and I still end up with a seq scan.  It makes sense, kinda, but 
it should be able to use the index to gather the right values.  I do 
have a production set of data inserted into the tables, so this is 
running realistically:

dli=# explain analyze SELECT ordered_products.product_id
dli-# FROM to_ship, ordered_products
dli-# WHERE to_ship.ordered_product_id = ordered_products.id AND
dli-# ordered_products.paid = TRUE AND
dli-# ordered_products.suspended_sub = FALSE;

You scan 600000 rows from to_ship to get about 25000 - so some way to 
cut this down would help.

Try out an explicit INNER JOIN which includes the filter info for paid 
and suspended_sub in the join condition (you may need indexes on each of 
id, paid and suspended_sub, so that the 8.1 optimizer can use a bitmap 
scan):

SELECT ordered_products.product_id
FROM to_ship INNER JOIN ordered_products
ON (to_ship.ordered_product_id = ordered_products.id
    AND ordered_products.paid = TRUE      AND 
ordered_products.suspended_sub = FALSE);

It has been a quiet day today, so I took another look at this. If the 
selectivity of clauses :

paid = TRUE
suspended_sub = FALSE

is fairly high, then rewriting as a subquery might help:

SELECT o.product_id
FROM ordered_products o
WHERE o.paid = TRUE
AND o.suspended_sub = FALSE
AND  EXISTS (
  SELECT 1
  FROM to_ship s
  WHERE s.ordered_product_id = o.id
);

However it depends on you not needing anything from to_ship in the 
SELECT list...

Cheers

Mark