Postgres 9.0 has a bias against indexes

Mladen Gogala <mladen.gogala@xxxxxxxxxxx> · Thu, 27 Jan 2011 10:41:08 -0500

I have a table EMP, with 14 rows and a description like this:
scott=> \d+ emp
                             Table "public.emp"
  Column  |            Type             | Modifiers | Storage  | 
Description
----------+-----------------------------+-----------+----------+-------------
 empno    | smallint                    | not null  | plain    |
 ename    | character varying(10)       |           | extended |
 job      | character varying(9)        |           | extended |
 mgr      | smallint                    |           | plain    |
 hiredate | timestamp without time zone |           | plain    |
 sal      | double precision            |           | plain    |
 comm     | double precision            |           | plain    |
 deptno   | smallint                    |           | plain    |
Indexes:
    "emp_pkey" PRIMARY KEY, btree (empno)
    "emp_mgr_i" btree (mgr)
Foreign-key constraints:
    "fk_deptno" FOREIGN KEY (deptno) REFERENCES dept(deptno)
Has OIDs: no

scott=>

A recursive query doesn't use existing index on mgr:
scott=> explain analyze
with recursive e(empno,ename,mgr,bossname,level) as (
select empno,ename,mgr,NULL::varchar,0 from emp where empno=7839
union
select emp.empno,emp.ename,emp.mgr,e.ename,e.level+1
from emp,e
where emp.mgr=e.empno)
select * from e;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 CTE Scan on e  (cost=20.59..23.21 rows=131 width=78) (actual 
time=0.020..0.143 rows=14 loops=1)
   CTE e
     ->  Recursive Union  (cost=0.00..20.59 rows=131 width=52) (actual 
time=0.018..0.128 rows=14 loops=1)
           ->  Seq Scan on emp  (cost=0.00..1.18 rows=1 width=10) 
(actual time=0.013..0.015 rows=1 loops=1)
                 Filter: (empno = 7839)
           ->  Hash Join  (cost=0.33..1.68 rows=13 width=52) (actual 
time=0.016..0.021 rows=3 loops=4)
                 Hash Cond: (public.emp.mgr = e.empno)
                 ->  Seq Scan on emp  (cost=0.00..1.14 rows=14 
width=10) (actual time=0.001..0.004 rows=14 loops=4)
                 ->  Hash  (cost=0.20..0.20 rows=10 width=44) (actual 
time=0.004..0.004 rows=4 loops=4)
                       Buckets: 1024  Batches: 1  Memory Usage: 1kB
                       ->  WorkTable Scan on e  (cost=0.00..0.20 
rows=10 width=44) (actual time=0.001..0.002 rows=4 loops=4)
 Total runtime: 0.218 ms
(12 rows)

scott=>

The optimizer will not use index, not even when I turn off both hash and 
merge joins. This is not particularly important for a table with 14 
rows, but for a larger table, this is a problem. The
only way to actually force the use of index is by disabling seqscan, but 
that chooses a wrong path
again, because it reads the "outer" table by primary key, which will be 
very slow. Full table scan,
done by the primary key is probably the slowest thing around. I know 
about the PostgreSQL philosophy
which says "hints are bad", and I deeply disagree with it, but would it 
be possible to have at
least one parameter that would change calculations in such a way that 
indexes are favored, where they exist?

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance