Re: extract(year from date) doesn't use index but maybe could?

Adam Tauno Williams <awilliam@xxxxxxxxxxxxx> · Sun, 19 Apr 2015 17:03:47 -0400

On Sun, 2015-04-19 at 13:10 -0700, Jon Dufresne wrote:
> On Sun, Apr 19, 2015 at 10:42 AM, Tomas Vondra
> <tomas.vondra@xxxxxxxxxxxxxxx> wrote:
> > On 04/19/15 19:16, Jon Dufresne wrote:
> >> Given the table:
> >> CREATE TABLE dates (id SERIAL, d DATE NOT NULL, t TEXT NOT NULL)
> >> With an *index* on field d. The following two queries are functionally
> >> equivalent:
> >> 1. SELECT * FROM dates WHERE d >= '1900-01-01'
> >> 2. SELECT * FROM dates WHERE EXTRACT(year from d) >= 1900'
> >> By functionally equivalent, they will return the same result set.
> >> Query 2 does not use the index, adding a performance cost. It seems
> >> there is an opportunity for optimization to handle these two queries
> >> equivalently to take advantage of the index.
> > Or you might try creating an expression index ..
> > CREATE INDEX date_year_idx ON dates((extract(year from d)));
> Certainly, but won't this add additional overhead in the form of two
> indexes; one for the column and one for the expression?
> My point is, why force the user to take these extra steps or add
> overhead when the the two queries (or two indexes) are functionally
> equivalent. 

But they aren't functionally equivalent.  One is an index on a
datetime/date, the other is an index just on the year [a DOUBLE].
Date/datetimes potentially have time zones, integer values do not - in
general time values are an order of magnitude more complicated than
people expect.

> Shouldn't this is an optimization handled by the database
> so the user doesn't need to hand optimize these differences?

Sometimes "d >= '1900-01-01'" and "EXTRACT(year from d) >= 1900" may be
equivalent; but not always.

-- 
Adam Tauno Williams <mailto:awilliam@xxxxxxxxxxxxx> GPG D95ED383
Systems Administrator, Python Developer, LPI / NCLA

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance