limit clause breaks query planner?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

 

I have a single table with about 10 million rows, and two indexes.  Index A is on a column A with 95% null values.  Index B is on a column B with about 10 values, ie. About a million rows of each value.

 

When I do a simple query on the table (no joins) with the following condition:

A is null AND

B = ‘21’

 

it uses the correct index, index B.  However, when I add a limit clause of 15, postgres decides to do a sequential scan :s.  Looking at the results from explain:

 

"Limit  (cost=0.00..3.69 rows=15 width=128)"

"  ->  Seq Scan on my_table this_  (cost=0.00..252424.24 rows=1025157 width=128)"

"        Filter: ((A IS NULL) AND ((B)::text = '21'::text))"

 

It appears that postgres is (very incorrectly) assuming that it will only have to retrieve 15 rows on a sequential scan, and gives a total cost of 3.69.  In reality, it has to scan many rows forward until it finds the correct value, yielding very poor performance for my table.

 

If I disable sequential scan (set enable_seqscan=false) it then incorrectly uses the index A that has 95% null values: it seems to incorrectly apply the same logic again that it will only have to retrieve 15 rows with the limit clause, and thinks that the index scan using A is faster than index scan B.

 

Only by deleting the index on A and disabling sequential scan will it use the correct index, which is of course by far the fastest.

 

Is there an assumption in the planner that a limit of 15 will mean that postgres will only have to read 15 rows?  If so is this a bad assumption?  If a particular query is faster without a limit, then surely it will also be faster with the limit.

 

Any workarounds for this?

 

Thanks

David


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux