I have a query that is being ran on PGSQL, and when queried at a fast rate for large data sets, it is taking a long time to run because it isn't making use of the available indexes. I found that changing the filter from multiple OR's to an IN clause causes the right index to be used. Is there a way I can force the index to be used even when using OR's?
Query with Disjunction:
SELECT field1, field2,..., fieldN
FROM table1 WHERE
((((field9='val1' OR field9='val2') OR field9='val3') OR field9='val4')
AND (field6='val5'));
Query Plan:
"Bitmap Heap Scan on table1 (cost=18.85..19.88 rows=1 width=395) (actual time=0.017..0.017 rows=0 loops=1)"
" Recheck Cond: (((field6)::text = 'val5'::text) AND (((field9)::text = 'val1'::text) OR ((field9)::text = 'val2'::text) OR ((field9)::text = 'val3'::text) OR ((field9)::text = 'val4'::text)))"
" -> BitmapAnd (cost=18.85..18.85 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)"
" -> Bitmap Index Scan on idx_field6_field9 (cost=0.00..9.01 rows=611 width=0) (actual time=0.015..0.015 rows=0 loops=1)"
" Index Cond: ((field6)::text = 'val5'::text)"
" -> BitmapOr (cost=9.59..9.59 rows=516 width=0) (never executed)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val1'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val2'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val3'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val4'::text)"
"Planning time: 0.177 ms"
"Execution time: 0.061 ms"
Query with IN
SELECT field1, field2,..., fieldN
FROM table1
WHERE
((field9 IN ('val1', 'val2', 'val3', 'val4'))
AND (field6='val5'));
Query Plan:
"Index Scan using idx_field6_field9 on table1 (cost=0.43..6.77 rows=1 width=395) (actual time=0.032..0.032 rows=0 loops=1)"
" Index Cond: (((field6)::text = 'val5'::text) AND ((field9)::text = ANY ('{val1,val2,val3,val4}'::text[])))"
"Planning time: 0.145 ms"
"Execution time: 0.055 ms"
There is an index on field 6 and field 9 which the second query uses as expected, which the first one also should. Field9 is also kind of like a state field, so its cardinality is extremely low - there's only like 9 different values across the whole table. Unfortunately, it isn't straightforward to change the query to use an IN clause, so getting PG to use the right plan would be ideal, even if it means using a different type of index.
Thank you