Bitmap index scan is bliss. Many thanks to the postgres team ! Now
searching in tables with a lot of fields and conditions is no longer a
pain.
And just a thought :
SELECT * FROM table WHERE category IN (1,2,3) ORDER BY price LIMIT 10;
Suppose you have an index on category, and another index on price.
Depending on the stats postgres has about the values, you'll either get :
0- seq scan + sort
1- Plain or Bitmap Index scan using "category", then sort by "price"
2- Index scan on "price", Filter on "category IN (1,2,3)", no sort.
1 is efficient if the category is rare. Postgres knows this and uses this
plan well.
Without a LIMIT, option 1 should be preferred.
2 is efficient if the items in the categories 1,2,3 are cheap (close to
the start of the index on price). However if the items in question are on
the other side of the index, it will index-scan a large part of the table.
This can be a big hit. Postgres has no stats about the correlation of
"category" and "price", so it won't know when there is going to be a
problem.
Another option would be interesting. It has two steps :
- Build a bitmap using the index on "category" (just like in case 1)
so we know which pages on the table have relevant rows
- Index scan on "price", but only looking in the heap for pages which are
flagged in the bitmap, and then "Recheck Cond" on "category".
In other words, do an index scan to get the rows in the right order, but
don't bother to check the heap for pages where the bitmap says there are
no rows.
In the worst case, you still have to run through the entire index, but at
least not through the entire table !
It can also speed up some merge joins.
What do you think ?