Re: Bad query plan with high-cardinality column

Kevin Grittner <kgrittn@xxxxxxxxx> · Fri, 22 Feb 2013 12:47:56 -0800 (PST)

Alexander Staubo <alex@xxxxxxxxxx> wrote:

> This is my schema:
>
>   create table comments (
>     id serial primary key,
>     conversation_id integer,
>     created_at timestamp
>   );
>   create index comments_conversation_id_index on comments (conversation_id);
>   create index comments_created_at_index on comments (created_at);

I suspect you would be better off without those two indexes, and
instead having an index on (conversation_id, created_at).  Not just
for the query you show, but in general.

>   select comments.id from comments where
>     conversation_id = 3975979 order by created_at limit 13
>
> This filters about 5000 rows and returns the oldest 13 rows. But
> the query is consistently planned wrong:

> [planner thinks it will be cheaper to read index in ORDER BY
> sequence and filter rows until it has 13 than to read 5471 rows
> and sort them to pick the top 13 after the sort.]

In my experience these problems come largely from the planner not
knowing the cost of dealing with each tuple.  I see a lot less of
this if I raise cpu_tuple_cost to something in the 0.03 to 0.05
range.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance