Sorry for the amateurish question, but what are "heap tuples"?
Also, my understanding is that the following statement applies only for
composite indexes: "PostgreSQL can't use the values stored in the index
to check the join condition". I assume that PostgreSQL will be able to
use single-column-indexes for join conditions. Is this correct?
Thank you,
Peter
Heikki Linnakangas wrote:
Arnaud Lesauvage wrote:
I have two table with a 2-column index on both of them.
In the first table, the first colum of the index is the primary key,
the second one is an integer field.
In the second table, the two columns are the primary key.
When I join these two tables, the 2-column index of the first table
is not used.
Why does the query planner think that this plan is better ?
ALTER TABLE geo.subcities_names
ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
language_id);
CREATE INDEX subcities_gid_language_id
ON geo.subcities
USING btree
(gid, official_language_id);
EXPLAIN ANALYZE
SELECT * FROM geo.subcities sc, geo.subcities_names scn
WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
scn.language_id;
My theory:
There's no additional restrictions besides the join condition, so the
system has to scan both tables completely. It chooses to use a full
index scan instead of a seq scan to be able to do a merge join.
Because it's going to have to scan the indexes completely anyway, it
chooses the smallest index which is subcities_pkey.
You'd think that the system could do the merge using just the indexes,
and only fetch the heap tuples for matches. If that were the case,
using the 2-column index would indeed be a good idea. However,
PostgreSQL can't use the values stored in the index to check the join
condition, so all the heap tuples are fetched anyway. There was just
recently discussion about this on this list:
http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.