Using the wrong index (very suboptimal), why?

"Shane Wright" <shane.wright@xxxxxxxxxxxxxxxxxxxx> · Mon, 22 Dec 2008 18:16:43 -0000

Hi,

I have somewhat of a quandary with a large table in my database;
PostgreSQL is choosing the 'wrong' index for a certain kind of query;
causing performance to become an order of magnitude slower (query times
usually measured in milliseconds now become seconds/minutes!).

It's not that it isn't using any index (so enable_seqscan=off doesn't
help), it's that the index it picks is suboptimal.

The query is based on 3 of the table columns - there is an index on all
three, but it prefers to use an index on just two of them, then
filtering by the 3rd (EXPLAIN output is below - showing the query as
well).

I've tried everything I can find to make it not do this (config
parameters, statistics target, vacuuming more...)  - what am I missing?

It happened a couple of times before - after some unusual activity
patterns in a client script - but repeatedly vacuuming until the planner
changed its mind made it go away.  We then made config changes
(random_page_cost and effective_cache_size) thinking that fixed the
reason why it made the bad decision in the first place...but it would
appear not

The only thing remaining I can think of is rebuilding the 'preferred'
index; that should reduce its size from 27Gb down to something more
manageable; maybe that will let the database want to use it more because
there'll be less I/O?

I've considered upgrading to 8.3.5 as well, but I can't see anything in
the changelogs that would fix this issue (I appreciate upgrading is a
good idea anyway, that will be done at some point soon anyway).

Hopefully all relevant info is listed below - if anyone's got any ideas
I'd appreciate any help or pointers anyone can give, thanks...

The server is PostgreSQL 8.3.0 on Linux with 32Gb RAM.  /var/lib/pgsql/
is on a fibre-channel SAN.  This table has around 680 million rows - and
has been reasonably regularly vacuumed, but is probably in dire need of
a VACUUM FULL and REINDEX to reclaim dead space (see disk space info at
the bottom of the post).

emystery=# \d answers
                              Table "public.answers"
 Column |  Type   |                             Modifiers
--------+---------+-----------------------------------------------------
---------------
 aid    | integer | not null default
nextval(('seq_answers_aid'::text)::regclass)
 rid    | integer | not null
 nid    | integer | not null
 iid1   | integer |
 iid2   | integer |
 iid3   | integer |
 text   | text    |
 extra  | bigint  |
Indexes:
    "answers_pkey" PRIMARY KEY, btree (aid)
    "index_answers_iid1" btree (iid1) WHERE iid1 > 0
    "index_answers_iid2" btree (iid2) WHERE iid2 > 0
    "index_answers_iid3" btree (iid3) WHERE iid3 > 0
    "index_answers_nidiid1" btree (nid, iid1)
    "index_answers_ridnidiid1" btree (rid, nid, iid1)

This is what it is doing [slowly] - but for many values of rid/nid/iid1
there are a lot of rows to filter through (tens/hundreds of thousands)
so this can take many seconds or minutes:

emystery=# explain select * from answers where rid=1 and nid=2 and
iid1=3;
------------------------------------------------------------------------
-------------------------
 Index Scan using index_answers_nidiid1 on answers  (cost=0.00..28.74
rows=1 width=62)
   Index Cond: ((nid = 2) AND (iid1 = 3))
   Filter: (rid = 1)

This is the pattern it *should* use (and does use on other installations
of similar/older data).  When this pattern is used the query always
completes in <1 second, usually ~0.2 seconds!

emystery20080821=# explain select * from answers where rid=1 and nid=2
and iid1=3;
------------------------------------------------------------------------
----------------------------
 Index Scan using index_answers_ridnidiid1 on answers  (cost=0.00..99.04
rows=1 width=67)
   Index Cond: ((rid = 1) AND (nid = 2) AND (iid1 = 3))

The PostgreSQL configuration has been altered to help prefer random
lookups [via an index], and to indicate to the database how much data
the OS 

#seq_page_cost = 1.0                    # measured on an arbitrary scale
random_page_cost = 1.0                  # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
effective_cache_size = 31GB

The table has been VACUUM ANALYSE'd (repeatedly!) to no avail.

We've also increased the statistics target for the columns in the table
to 200 (from 10) which still isn't making any difference (I thought
because the table is so large and the dataset is quite uneven that this
should help...it's now at 1000 and vacuuming again so we might be
lucky...)

alter table answers alter column rid set statistics 200;
alter table answers alter column nid set statistics 200;
alter table answers alter column iid1 set statistics 200;

Here is the [full] output from the most recent VACUUM:

emystery=# vacuum verbose analyse answers;
INFO:  vacuuming "public.answers"
INFO:  scanned index "index_answers_nidiid1" to remove 38 row versions
DETAIL:  CPU 39.09s/64.46u sec elapsed 1077.90 sec.
INFO:  scanned index "index_answers_iid3" to remove 38 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "index_answers_iid2" to remove 38 row versions
DETAIL:  CPU 19.72s/37.15u sec elapsed 404.70 sec.
INFO:  scanned index "index_answers_iid1" to remove 38 row versions
DETAIL:  CPU 28.92s/59.76u sec elapsed 414.62 sec.
INFO:  scanned index "index_answers_ridnidiid1" to remove 38 row
versions
DETAIL:  CPU 50.99s/71.36u sec elapsed 742.03 sec.
INFO:  scanned index "answers_pkey" to remove 38 row versions
DETAIL:  CPU 28.45s/65.63u sec elapsed 376.34 sec.
INFO:  "answers": removed 38 row versions in 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "index_answers_nidiid1" now contains 671965676 row versions
in 2646699 pages
DETAIL:  38 index row versions were removed.
2497 index pages have been deleted, 2497 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "index_answers_iid3" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "index_answers_iid2" now contains 392408456 row versions in
1302348 pages
DETAIL:  24 index row versions were removed.
9760 index pages have been deleted, 9760 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "index_answers_iid1" now contains 626231453 row versions in
2044032 pages
DETAIL:  38 index row versions were removed.
1576 index pages have been deleted, 1576 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "index_answers_ridnidiid1" now contains 671965676 row
versions in 3604651 pages
DETAIL:  38 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "answers_pkey" now contains 671965676 row versions in
2022030 pages
DETAIL:  38 index row versions were removed.
18670 index pages have been deleted, 18670 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "answers": found 38 removable, 671965676 nonremovable row
versions in 4481083 pages
DETAIL:  192913 dead row versions cannot be removed yet.
There were 39603896 unused item pointers.
380386 pages contain useful free space.
0 pages are entirely empty.
CPU 235.00s/366.10u sec elapsed 4100.83 sec.
INFO:  vacuuming "pg_toast.pg_toast_1118174"
INFO:  index "pg_toast_1118174_index" now contains 4479 row versions in
33 pages
DETAIL:  0 index row versions were removed.
1 index pages have been deleted, 1 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.25 sec.
INFO:  "pg_toast_1118174": found 0 removable, 4479 nonremovable row
versions in 1232 pages
DETAIL:  14 dead row versions cannot be removed yet.
There were 1971 unused item pointers.
441 pages contain useful free space.
0 pages are entirely empty.
CPU 0.01s/0.00u sec elapsed 1.18 sec.
INFO:  analyzing "public.answers"
INFO:  "answers": scanned 60000 of 4481083 pages, containing 9003061
live rows and 2964 dead rows; 60000 rows in sample, 672391060 estimated
total rows
VACUUM

This information might help tell you a bit more about the contents of
the table:

emystery=# SELECT relname, relkind, reltuples, relpages FROM pg_class
WHERE relname LIKE '%answers%';
           relname                 | relkind |  reltuples  | relpages
-----------------------------------+---------+-------------+----------
 answers                           | r       | 6.71966e+08 |  4481083
 seq_answers_aid                   | S       |           1 |        1
 index_answers_nidiid1             | i       | 6.71966e+08 |  2646699
 index_answers_iid3                | i       |           0 |        1
 index_answers_iid2                | i       | 3.92408e+08 |  1302348
 index_answers_iid1                | i       | 6.26231e+08 |  2044032
 index_answers_ridnidiid1          | i       | 6.71966e+08 |  3604651
 answers_pkey                      | i       | 6.71966e+08 |  2022030

emystery=# SELECT attname,null_frac, n_distinct,correlation FROM
pg_stats WHERE tablename = 'answers';
 attname | null_frac | n_distinct | correlation
---------+-----------+------------+-------------
 aid     |         0 |         -1 |     0.78259
 rid     |         0 |     187498 | -0.00133284
 nid     |         0 |      27179 |   0.0342285
 iid1    |    0.0668 |      49520 |   0.0306873
 iid2    |    0.4187 |       5140 |    0.317481
 iid3    |         1 |          0 |
 text    |  0.933117 |       2573 |    0.112144
 extra   |   0.99995 |         -1 |           1

Disk space usage, as calculated from pg_class.relpages * 8 (in
kilobytes) is:

Table data = 34.19Gb
Toast = 64Kb
index_ecos_answers_ridnidiid1	27.50Gb
index_ecos_answers_nidiid1	20.19Gb
index_ecos_answers_iid1	15.59Gb
ecos_answers_pkey	15.43Gb
index_ecos_answers_iid2	9.94Gb
index_ecos_answers_iid3	8.00Kb

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general