When i read about 8.3 support indexed queries on NULL values (like rubric_id is NULL) i was really happy.
But reality strike again... look like NULL in WHERE don't allow effective using
index on (rubric_id, pos) for queries like:
... WHERE rubric_id IS NULL ORDER BY pos LIMIT 5
Here is some details about my issue (all tests on fresh loaded/analyzed into empty 8.3.5 DB):
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id=8 order by pos limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1337.02..1337.03 rows=5 width=28) (actual time=27.556..27.575 rows=5 loops=1)
-> Sort (cost=1337.02..1340.77 rows=1501 width=28) (actual time=27.552..27.558 rows=5 loops=1)
Sort Key: pos
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on cluster_weight (cost=0.00..1312.09 rows=1501 width=28) (actual time=0.058..25.008 rows=1501 loops=1)
Filter: (rubric_id = 8)
Total runtime: 27.638 ms
(7 rows)
ok so we need index on (rubric_id, pos), lets add it:
mboguk_billing=# CREATE INDEX cluster_weight_2 on cluster_weight(rubric_id, pos);
CREATE INDEX
And try again:
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id=8 order by pos limit 5;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1.70 rows=5 width=28) (actual time=0.095..0.122 rows=5 loops=1)
-> Index Scan using cluster_weight_2 on cluster_weight (cost=0.00..509.31 rows=1501 width=28) (actual time=0.090..0.104 rows=5 loops=1)
Index Cond: (rubric_id = 8)
Total runtime: 0.176 ms
(4 rows)
Ok... so now query works as intended...
Lets check are index used on search NULL values:
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id IS NULL;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using cluster_weight_2 on cluster_weight (cost=0.00..1034.21 rows=26435 width=28) (actual time=0.053..48.123 rows=26435 loops=1)
Index Cond: (rubric_id IS NULL)
Total runtime: 85.210 ms
(3 rows)
Yes it is working...
Now lets try main query over NULL:
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id IS NULL ORDER BY pos LIMIT 5;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1473.29..1473.30 rows=5 width=28) (actual time=92.220..92.239 rows=5 loops=1)
-> Sort (cost=1473.29..1539.37 rows=26435 width=28) (actual time=92.216..92.223 rows=5 loops=1)
Sort Key: pos
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using cluster_weight_2 on cluster_weight (cost=0.00..1034.21 rows=26435 width=28) (actual time=0.033..47.333 rows=26435 loops=1)
Index Cond: (rubric_id IS NULL)
Total runtime: 92.310 ms
(7 rows)
Ooops... that is surprise...
I wasn't ready see that plan here... and performance difference over 1000.
Looks like when indexed search over NULL's added, planner wasn't learned right way use such index on "where something is NULL order by ... limit ..."
queries.
Thanks for any responses and sorry for not so good English.
--
SY, Maxim Boguk
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general