Planner sometimes doesn't use a relevant index with IN (subquery) condition

Rafał Rzepecki <divided.mind@xxxxxxxxx> · Sun, 11 Nov 2012 04:18:31 +0100

[Please CC me on replies, as I'm not subscribed; thank you.]

I've ran into a problem with the query planner and IN (subquery)
conditions which I suspect to be a bug. I'll attempt to describe the
relevant details of my database and explain which behaviour I find
unexpected. I've also tried to trigger this behaviour in a clean
database; I think I've succeeded, but the conditions are a bit
different, so perhaps it's a different problem. I'll describe this
setup in detail below.

I have a somewhat large table (~2.5M rows), stats, which is quite
often (several records a minute) INSERTed to, but never UPDATEd or
DELETEd from. (In case it's relevant, it has an attached AFTER INSERT
trigger which checks time and rebuilds an aggregate materialized view
every hour.) This is the schema:
# \d+ stats
                                                  Table "serverwatch.stats"
      Column      |            Type             |
Modifiers                      | Storage | Description
------------------+-----------------------------+----------------------------------------------------+---------+-------------
 id               | integer                     | not null default
nextval('stats_id_seq'::regclass) | plain   |
 run_id           | integer                     | not null
                              | plain   |
 start_time       | timestamp without time zone | not null
                              | plain   |
 end_time         | timestamp without time zone | not null
                              | plain   |
 cpu_utilization  | double precision            |
                              | plain   |
 disk_read_ops    | bigint                      |
                              | plain   |
 disk_write_ops   | bigint                      |
                              | plain   |
 network_out      | bigint                      |
                              | plain   |
 network_in       | bigint                      |
                              | plain   |
 disk_read_bytes  | bigint                      |
                              | plain   |
 disk_write_bytes | bigint                      |
                              | plain   |
Indexes:
    "stats_pkey" PRIMARY KEY, btree (id)
    "stats_day_index" btree (run_id, day(stats.*))
    "stats_month_index" btree (run_id, month(stats.*))
    "stats_week_index" btree (run_id, week(stats.*))
Foreign-key constraints:
    "stats_runs" FOREIGN KEY (run_id) REFERENCES runs(id)
Triggers:
    stats_day_refresh_trigger AFTER INSERT OR UPDATE ON stats FOR EACH
STATEMENT EXECUTE PROCEDURE mat_view_refresh('serverwatch.stats_day')
Has OIDs: no

day(), month() and week() functions are just trivial date_trunc on a
relevant field. The referenced table looks like this:
# \d+ runs
                                                 Table "serverwatch.runs"
     Column      |            Type             |
Modifiers                     | Storage | Description
-----------------+-----------------------------+---------------------------------------------------+---------+-------------
 id              | integer                     | not null default
nextval('runs_id_seq'::regclass) | plain   |
 server_id       | integer                     | not null
                            | plain   |
 flavor          | flavor                      | not null
                            | plain   |
 region          | region                      | not null
                            | plain   |
 launch_time     | timestamp without time zone | not null
                            | plain   |
 stop_time       | timestamp without time zone |
                            | plain   |
 project_info_id | integer                     | not null
                            | plain   |
 owner_info_id   | integer                     | not null
                            | plain   |
Indexes:
    "runs_pkey" PRIMARY KEY, btree (id)
    "index_runs_on_flavor" btree (flavor)
    "index_runs_on_owner_info_id" btree (owner_info_id)
    "index_runs_on_project_info_id" btree (project_info_id)
    "index_runs_on_region" btree (region)
    "index_runs_on_server_id" btree (server_id)
Foreign-key constraints:
    "runs_owner_info_id_fkey" FOREIGN KEY (owner_info_id) REFERENCES
user_infos(id)
    "runs_project_info_id_fkey" FOREIGN KEY (project_info_id)
REFERENCES project_infos(id)
Referenced by:
    TABLE "stats_day" CONSTRAINT "stats_day_runs" FOREIGN KEY (run_id)
REFERENCES runs(id)
    TABLE "stats" CONSTRAINT "stats_runs" FOREIGN KEY (run_id)
REFERENCES runs(id)
Has OIDs: no

Now consider this query - note I'm using a subselect here because the
problem originally manifested itself with a view:
SELECT * FROM (SELECT run_id, disk_write_ops FROM stats) AS s WHERE
run_id IN (SELECT id FROM runs WHERE server_id = 515);

As might be expected, the planner chooses to use one of the three
indices with run_id:
http://explain.depesz.com/s/XU3Q

Now consider a similar query, but with aggregation:
SELECT * FROM (SELECT run_id, SUM(disk_write_ops) FROM stats GROUP BY
run_id) AS s WHERE run_id IN (SELECT id FROM runs WHERE server_id =
515);

Now the picture is very different. The planner, unexplicably,
dismisses the index and opts instead to do a full scan on stats, the
table 2.5 million rows big.
http://explain.depesz.com/s/Rqt

Note that the problem disappears when we replace the IN condition with literal:
SELECT * FROM (SELECT run_id, SUM(disk_write_ops) FROM stats GROUP BY
run_id) AS s WHERE run_id IN (1815, 1816);

The ids are the result of the inner select ran separately, so the
query has the exact same result; it's worth pointing out that the
planner has a correct estimate on the selectivity of the condition -
exactly two rows from runs are selected, as expected. But when literal
is used the planner correctly chooses to use the index:
http://explain.depesz.com/s/lYc

Similarly a correct plan is chosen when we unnest the inner SELECT:
SELECT run_id, SUM(disk_write_ops) FROM stats WHERE run_id IN (SELECT
id FROM runs WHERE server_id = 515) GROUP BY run_id;
http://explain.depesz.com/s/dlwZ

I've tried to replicate this on a clean database:
CREATE TABLE runs(run_id serial PRIMARY KEY, server_id INTEGER NOT NULL);
CREATE INDEX runs_server ON runs(server_id);
CREATE TABLE stats(entry_id serial PRIMARY KEY, run_id integer
REFERENCES runs NOT NULL, utilization INTEGER NOT NULL);
CREATE INDEX stats_runs ON stats(run_id);

Now let's try some queries:
SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE
run_id IN (1212, 2323, 121, 561, 21, 561, 125, 2, 55, 52, 42);
http://explain.depesz.com/s/Kcb - fine, index used

SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE
run_id IN (SELECT run_id FROM runs WHERE server_id = 515);
http://explain.depesz.com/s/QFs - seqscan!
Obviously it doesn't mean much, as the tables are empty and there are
no stats, but still a radically different plan is chosen for what is
essentially the same query.

Note that in this case the behaviour is the same even when unnested:
SELECT run_id, utilization FROM stats WHERE run_id IN (SELECT run_id
FROM runs WHERE server_id = 515);
http://explain.depesz.com/s/y3GM

So, is this a bug in the planner, or am I somehow subtly changing the
semantics of the query and don't notice?
I understand the planner perhaps tries to parallelize queries when a
SELECT is used in the IN clause, but given the stats it doesn't seem
to make much sense.

Thanks, and let me know if you want me to test something on my
database over here or if there's some relevant info I've ommited.

(PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, 64-bit
running on Ubuntu 12.10, ubuntu package
postgresql-9.1-9.1.6-1ubuntu1:amd64, default configuration)
--
Rafał Rzepecki

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance