This indeed works around the issue. Thanks! On Mon, Nov 12, 2012 at 9:53 AM, ashutosh durugkar <ashucould@xxxxxxxxx> wrote: > Hey Rafal, > > >>SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE > run_id IN (SELECT run_id FROM runs WHERE server_id = 515); > > could you try this: > > > SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE > run_id = ANY(ARRAY(SELECT run_id FROM runs WHERE server_id = 515)); > > Thanks, > > On Sun, Nov 11, 2012 at 8:48 AM, Rafał Rzepecki <divided.mind@xxxxxxxxx> > wrote: >> >> [Please CC me on replies, as I'm not subscribed; thank you.] >> >> I've ran into a problem with the query planner and IN (subquery) >> conditions which I suspect to be a bug. I'll attempt to describe the >> relevant details of my database and explain which behaviour I find >> unexpected. I've also tried to trigger this behaviour in a clean >> database; I think I've succeeded, but the conditions are a bit >> different, so perhaps it's a different problem. I'll describe this >> setup in detail below. >> >> I have a somewhat large table (~2.5M rows), stats, which is quite >> often (several records a minute) INSERTed to, but never UPDATEd or >> DELETEd from. (In case it's relevant, it has an attached AFTER INSERT >> trigger which checks time and rebuilds an aggregate materialized view >> every hour.) This is the schema: >> # \d+ stats >> Table >> "serverwatch.stats" >> Column | Type | >> Modifiers | Storage | Description >> >> ------------------+-----------------------------+----------------------------------------------------+---------+------------- >> id | integer | not null default >> nextval('stats_id_seq'::regclass) | plain | >> run_id | integer | not null >> | plain | >> start_time | timestamp without time zone | not null >> | plain | >> end_time | timestamp without time zone | not null >> | plain | >> cpu_utilization | double precision | >> | plain | >> disk_read_ops | bigint | >> | plain | >> disk_write_ops | bigint | >> | plain | >> network_out | bigint | >> | plain | >> network_in | bigint | >> | plain | >> disk_read_bytes | bigint | >> | plain | >> disk_write_bytes | bigint | >> | plain | >> Indexes: >> "stats_pkey" PRIMARY KEY, btree (id) >> "stats_day_index" btree (run_id, day(stats.*)) >> "stats_month_index" btree (run_id, month(stats.*)) >> "stats_week_index" btree (run_id, week(stats.*)) >> Foreign-key constraints: >> "stats_runs" FOREIGN KEY (run_id) REFERENCES runs(id) >> Triggers: >> stats_day_refresh_trigger AFTER INSERT OR UPDATE ON stats FOR EACH >> STATEMENT EXECUTE PROCEDURE mat_view_refresh('serverwatch.stats_day') >> Has OIDs: no >> >> day(), month() and week() functions are just trivial date_trunc on a >> relevant field. The referenced table looks like this: >> # \d+ runs >> Table "serverwatch.runs" >> Column | Type | >> Modifiers | Storage | Description >> >> -----------------+-----------------------------+---------------------------------------------------+---------+------------- >> id | integer | not null default >> nextval('runs_id_seq'::regclass) | plain | >> server_id | integer | not null >> | plain | >> flavor | flavor | not null >> | plain | >> region | region | not null >> | plain | >> launch_time | timestamp without time zone | not null >> | plain | >> stop_time | timestamp without time zone | >> | plain | >> project_info_id | integer | not null >> | plain | >> owner_info_id | integer | not null >> | plain | >> Indexes: >> "runs_pkey" PRIMARY KEY, btree (id) >> "index_runs_on_flavor" btree (flavor) >> "index_runs_on_owner_info_id" btree (owner_info_id) >> "index_runs_on_project_info_id" btree (project_info_id) >> "index_runs_on_region" btree (region) >> "index_runs_on_server_id" btree (server_id) >> Foreign-key constraints: >> "runs_owner_info_id_fkey" FOREIGN KEY (owner_info_id) REFERENCES >> user_infos(id) >> "runs_project_info_id_fkey" FOREIGN KEY (project_info_id) >> REFERENCES project_infos(id) >> Referenced by: >> TABLE "stats_day" CONSTRAINT "stats_day_runs" FOREIGN KEY (run_id) >> REFERENCES runs(id) >> TABLE "stats" CONSTRAINT "stats_runs" FOREIGN KEY (run_id) >> REFERENCES runs(id) >> Has OIDs: no >> >> Now consider this query - note I'm using a subselect here because the >> problem originally manifested itself with a view: >> SELECT * FROM (SELECT run_id, disk_write_ops FROM stats) AS s WHERE >> run_id IN (SELECT id FROM runs WHERE server_id = 515); >> >> As might be expected, the planner chooses to use one of the three >> indices with run_id: >> http://explain.depesz.com/s/XU3Q >> >> Now consider a similar query, but with aggregation: >> SELECT * FROM (SELECT run_id, SUM(disk_write_ops) FROM stats GROUP BY >> run_id) AS s WHERE run_id IN (SELECT id FROM runs WHERE server_id = >> 515); >> >> Now the picture is very different. The planner, unexplicably, >> dismisses the index and opts instead to do a full scan on stats, the >> table 2.5 million rows big. >> http://explain.depesz.com/s/Rqt >> >> Note that the problem disappears when we replace the IN condition with >> literal: >> SELECT * FROM (SELECT run_id, SUM(disk_write_ops) FROM stats GROUP BY >> run_id) AS s WHERE run_id IN (1815, 1816); >> >> The ids are the result of the inner select ran separately, so the >> query has the exact same result; it's worth pointing out that the >> planner has a correct estimate on the selectivity of the condition - >> exactly two rows from runs are selected, as expected. But when literal >> is used the planner correctly chooses to use the index: >> http://explain.depesz.com/s/lYc >> >> Similarly a correct plan is chosen when we unnest the inner SELECT: >> SELECT run_id, SUM(disk_write_ops) FROM stats WHERE run_id IN (SELECT >> id FROM runs WHERE server_id = 515) GROUP BY run_id; >> http://explain.depesz.com/s/dlwZ >> >> >> I've tried to replicate this on a clean database: >> CREATE TABLE runs(run_id serial PRIMARY KEY, server_id INTEGER NOT NULL); >> CREATE INDEX runs_server ON runs(server_id); >> CREATE TABLE stats(entry_id serial PRIMARY KEY, run_id integer >> REFERENCES runs NOT NULL, utilization INTEGER NOT NULL); >> CREATE INDEX stats_runs ON stats(run_id); >> >> Now let's try some queries: >> SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE >> run_id IN (1212, 2323, 121, 561, 21, 561, 125, 2, 55, 52, 42); >> http://explain.depesz.com/s/Kcb - fine, index used >> >> SELECT * FROM (SELECT run_id, utilization FROM stats) AS s WHERE >> run_id IN (SELECT run_id FROM runs WHERE server_id = 515); >> http://explain.depesz.com/s/QFs - seqscan! >> Obviously it doesn't mean much, as the tables are empty and there are >> no stats, but still a radically different plan is chosen for what is >> essentially the same query. >> >> Note that in this case the behaviour is the same even when unnested: >> SELECT run_id, utilization FROM stats WHERE run_id IN (SELECT run_id >> FROM runs WHERE server_id = 515); >> http://explain.depesz.com/s/y3GM >> >> So, is this a bug in the planner, or am I somehow subtly changing the >> semantics of the query and don't notice? >> I understand the planner perhaps tries to parallelize queries when a >> SELECT is used in the IN clause, but given the stats it doesn't seem >> to make much sense. >> >> Thanks, and let me know if you want me to test something on my >> database over here or if there's some relevant info I've ommited. >> >> (PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc >> (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, 64-bit >> running on Ubuntu 12.10, ubuntu package >> postgresql-9.1-9.1.6-1ubuntu1:amd64, default configuration) >> -- >> Rafał Rzepecki >> >> >> -- >> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-performance > > -- Rafał Rzepecki -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance