Re: Postgres optimizer choosing wrong index

Jack Orenstein <jack.orenstein@xxxxxxx> · Mon, 27 Oct 2008 17:05:12 -0400

Tom Lane wrote:
Jack Orenstein <jack.orenstein@xxxxxxx> writes:
- I created two schemas, NOVAC and VAC, each with a table T as described above.

- Before loading data, I ran VACUUM ANALYZE on VAC.T.

- I then started loading data. The workload is a mixture of INSERT,   SELECT and 
UPDATE. For SELECT and UPDATE the WHERE clause always includes "dh = ? and fh = ?".

Basically your problem here is that vacuum records the size of the table
as zero (in pg_class.relpages/reltuples) and that causes the computed
costs of the two indexscans to be exactly the same, so it's a tossup
which one gets used.  (In recent versions I think the index with higher
OID would typically get chosen in a tie, but I forget if 7.4 worked that
way.)

8.0 and up are smart enough not to believe pg_class.relpages anymore
after you've loaded a lot of data, but 7.4 isn't.  In testing similar
cases here, I get reasonable cost estimates and a sane plan choice
from 7.4 so long as the stats are up to date.

Bottom line: you need to vacuum (or preferably analyze) *after*
initially populating a table, not before.

OK, I've added this behavior to my application. As the table is being loaded, I 
run VACUUM ANALYZE every 500 inserts, until we get to size 10,000. I know this 
is working because of application-level logging, and because I see relpages and 
reltuples go up.

EXPLAIN says that the correct index is being used -- it didn't used to. However, 
pg_stat* says otherwise. In my test, I have exactly one dh value. Running 
EXPLAIN with this value produces a plan using idx_dh (the correct index), but 
pg_stats says that idx_dn is being used (see psql session below).

This eventually works itself out. Eventually, the pg_stats for idx_dh start 
going up, showing that that index is eventually being used. But this discrepancy 
between EXPLAIN and actual query execution is making life very difficult.

Is the discrepancy between EXPLAIN and pg_stats due to some sort of caching per 
connection? E.g., a connection that uses one plan for a query is stuck with that 
plan for that query?

What would really be nice is a logging option that reported the execution plan 
actually used for a query.

Jack

ris=# \d t;
               Table "vac.t"
    Column    |          Type          | Modifiers
--------------+------------------------+-----------
 dh           | integer                | not null
 fh           | integer                | not null
 nm           | bigint                 |
...
Indexes:
    "idx_dn" btree (dh, nm)
    "idx_dh" btree (dh, fh)

ris=# select dh, count(*) from t group by dh;
 dh        | count
-----------+-------
 589849733 | 19890
(1 row)

ris=# explain select * from t where dh = 589849733 and fh = 0;
                                    QUERY PLAN
------------------------------------------------------------------
 Index Scan using idx_dh on t  (cost=0.00..5.26 rows=2 width=570)
   Index Cond: ((dh = 589849733) AND (fh = 0))
(2 rows)

ris=# select schemaname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch 
from pg_stat_user_indexes where schemaname = 'vac' and relname = 't' order by 1, 2;
 schemaname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
------------+--------------+----------+--------------+---------------
 vac        | idx_dn       |    31315 |    122773990 |     122773990
 vac        | idx_dh       |        0 |            0 |             0

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general