Should the optimiser convert a CASE into a WHERE if it can?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dear All,

Just wondering whether there is a missing scope for the query planner (on 8.4.2) to be cleverer than it currently is.

Specifically, I wonder whether the optimiser should know that by converting a CASE condition into a WHERE condition, it can use an index.

Have I found a possible enhancement, or is this simply too hard to do?

Best wishes,

Richard



Example:
--------

In this example, tbl_tracker has 255751 rows, with a primary key "id", whose values lie uniformly in the range 1...1255750.

If one is trying to count multiple conditions, the following query seems to be the most obvious way to do it:

SELECT
  SUM (case when id > 1200000 and id < 1210000 then 1 else 0 end) AS c1,
  SUM (case when id > 1210000 and id < 1220000 then 1 else 0 end) AS c2,
  SUM (case when id > 1220000 and id < 1230000 then 1 else 0 end) AS c3,
  SUM (case when id > 1230000 and id < 1240000 then 1 else 0 end) AS c4,
  SUM (case when id > 1240000 and id < 1250000 then 1 else 0 end) AS c5
FROM tbl_tracker;


  c1  |  c2  |  c3  |  c4  |  c5
------+------+------+------+------
 2009 | 2018 | 2099 | 2051 | 2030

Time: 361.666 ms



This can be manually optimised into a far uglier (but much much faster) query:

SELECT * FROM
 (SELECT COUNT (1) AS c1 FROM tbl_tracker
    WHERE id > 1200000 and id < 1210000) AS s1,
 (SELECT COUNT (1) AS c2 FROM tbl_tracker
    WHERE id > 1210000 and id < 1220000) AS s2,
 (SELECT COUNT (1) AS c3 FROM tbl_tracker
    WHERE id > 1220000 and id < 1230000) AS s3,
 (SELECT COUNT (1) AS c4 FROM tbl_tracker
    WHERE id > 1230000 and id < 1240000) AS s4,
 (SELECT COUNT (1) AS c5 FROM tbl_tracker
    WHERE id > 1240000 and id < 1250000) AS s5

  c1  |  c2  |  c3  |  c4  |  c5
------+------+------+------+------
 2009 | 2018 | 2099 | 2051 | 2030
(1 row)

Time: 21.091 ms





Debugging
---------

The simple queries are:

SELECT SUM (case when id > 1200000 and id < 1210000 then 1 else 0 end) from tbl_tracker;

Time: 174.804 ms

Explain shows that this does a sequential scan.



SELECT COUNT(1) from tbl_tracker WHERE id > 1200000 and id < 1210000;

Time: 4.153 ms

Explain shows that this uses the index, as expected.



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux