Search Postgresql Archives

WHERE vs ORDER BY vs LIMIT why not using the correct index?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



All,

I have a simple query:

  SELECT tcq_id
  FROM queue q
  WHERE q.status = 'D'
  ORDER BY tcq_id ASC
  LIMIT 1;

What I'm trying to do is find all the items in the queue which have a status of 'D' and then select the oldest item first. My data is very skewed such that there are not many records matching the WHERE clause.

  status  my_count
  D              4
  Q              6
  S             20
  P             74
  F           3294
  C         291206

However, when I explain that query, it's doing an index scan on the 'tcq_id' primary key column instead of using the index on 'status':

------------------------------
    EXPLAIN SELECT tcq_id
    FROM queue q
    WHERE q.status = 'D'
    ORDER BY tcq_id ASC
    LIMIT 1;

 Limit  (cost=0.00..40.09 rows=1 width=8)
-> Index Scan using queue_pkey on queue q (cost=0.00..59059.80 rows=1473 width=8)
         Filter: (status = 'D'::bpchar)
------------------------------

But then when I take out the ORDER BY clause, PostgreSQL will use a different (and proper) index:

------------------------------
    EXPLAIN SELECT tcq_id
    FROM transcode_queue q
    WHERE q.status = 'D'
    LIMIT 1;

 Limit  (cost=0.00..3.81 rows=1 width=8)
-> Index Scan using queue_idx_status on queue q (cost=0.00..5618.07 rows=1473 width=8)
         Index Cond: (status = 'D'::bpchar)
------------------------------

I don't understand why the ORDER BY condition would be affecting my WHERE criteria. Shouldn't the ordering be done after the filter is first applied?

I'm wanting: "find the 4 rows where status = 'D' then order those by tcq_id and return the first one." But postgresql seems to be choosing: "order all records by the tcq_id then scan them sequentially and find the first one matching status = 'D'".

How can I influence the planner's decision while keeping my ORDER BY clause?

After furthing testing, maybe it's not the ORDER BY but the LIMIT that is causing the poor planner choice? I tried to do this:

------------------------------
    SELECT tcq_id
    FROM (
      SELECT tcq_id
      FROM queue q
      WHERE q.status = 'D'
    ) x
    ORDER BY x.tcq_id ASC
    LIMIT 1;

 Limit  (cost=0.00..40.09 rows=1 width=8)
-> Index Scan using queue_pkey on queue q (cost=0.00..59059.80 rows=1473 width=8)
         Filter: (status = 'D'::bpchar)
------------------------------

But this results in another wrong index choice. So, I removed the LIMIT clause and now it does use the right index:

------------------------------
    SELECT tcq_id
    FROM (
      SELECT tcq_id
      FROM queue q
      WHERE q.status = 'D'
    ) x
    ORDER BY x.tcq_id ASC;

 Sort  (cost=4314.36..4318.05 rows=1473 width=8)
   Sort Key: q.tcq_id
   ->  Bitmap Heap Scan on queue q  (cost=35.71..4236.85 rows=1473 width=8)
         Recheck Cond: (status = 'D'::bpchar)
-> Bitmap Index Scan on queue_idx_status (cost=0.00..35.34 rows=1473 width=0)
               Index Cond: (status = 'D'::bpchar)
------------------------------

Can someone shed some insight here and help me understand what's going on?

-- Dante


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org/

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux