All,
I have a simple query:
SELECT tcq_id
FROM queue q
WHERE q.status = 'D'
ORDER BY tcq_id ASC
LIMIT 1;
What I'm trying to do is find all the items in the queue which have a
status of 'D' and then select the oldest item first. My data is very
skewed such that there are not many records matching the WHERE clause.
status my_count
D 4
Q 6
S 20
P 74
F 3294
C 291206
However, when I explain that query, it's doing an index scan on the
'tcq_id' primary key column instead of using the index on 'status':
------------------------------
EXPLAIN SELECT tcq_id
FROM queue q
WHERE q.status = 'D'
ORDER BY tcq_id ASC
LIMIT 1;
Limit (cost=0.00..40.09 rows=1 width=8)
-> Index Scan using queue_pkey on queue q (cost=0.00..59059.80
rows=1473 width=8)
Filter: (status = 'D'::bpchar)
------------------------------
But then when I take out the ORDER BY clause, PostgreSQL will use a
different (and proper) index:
------------------------------
EXPLAIN SELECT tcq_id
FROM transcode_queue q
WHERE q.status = 'D'
LIMIT 1;
Limit (cost=0.00..3.81 rows=1 width=8)
-> Index Scan using queue_idx_status on queue q (cost=0.00..5618.07
rows=1473 width=8)
Index Cond: (status = 'D'::bpchar)
------------------------------
I don't understand why the ORDER BY condition would be affecting my
WHERE criteria. Shouldn't the ordering be done after the filter is
first applied?
I'm wanting: "find the 4 rows where status = 'D' then order those by
tcq_id and return the first one." But postgresql seems to be choosing:
"order all records by the tcq_id then scan them sequentially and find
the first one matching status = 'D'".
How can I influence the planner's decision while keeping my ORDER BY clause?
After furthing testing, maybe it's not the ORDER BY but the LIMIT that
is causing the poor planner choice? I tried to do this:
------------------------------
SELECT tcq_id
FROM (
SELECT tcq_id
FROM queue q
WHERE q.status = 'D'
) x
ORDER BY x.tcq_id ASC
LIMIT 1;
Limit (cost=0.00..40.09 rows=1 width=8)
-> Index Scan using queue_pkey on queue q (cost=0.00..59059.80
rows=1473 width=8)
Filter: (status = 'D'::bpchar)
------------------------------
But this results in another wrong index choice. So, I removed the LIMIT
clause and now it does use the right index:
------------------------------
SELECT tcq_id
FROM (
SELECT tcq_id
FROM queue q
WHERE q.status = 'D'
) x
ORDER BY x.tcq_id ASC;
Sort (cost=4314.36..4318.05 rows=1473 width=8)
Sort Key: q.tcq_id
-> Bitmap Heap Scan on queue q (cost=35.71..4236.85 rows=1473 width=8)
Recheck Cond: (status = 'D'::bpchar)
-> Bitmap Index Scan on queue_idx_status (cost=0.00..35.34
rows=1473 width=0)
Index Cond: (status = 'D'::bpchar)
------------------------------
Can someone shed some insight here and help me understand what's going on?
-- Dante
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org/