On Sun, 2015-04-19 at 13:10 -0700, Jon Dufresne wrote: > On Sun, Apr 19, 2015 at 10:42 AM, Tomas Vondra > <tomas.vondra@xxxxxxxxxxxxxxx> wrote: > > On 04/19/15 19:16, Jon Dufresne wrote: > >> Given the table: > >> CREATE TABLE dates (id SERIAL, d DATE NOT NULL, t TEXT NOT NULL) > >> With an *index* on field d. The following two queries are functionally > >> equivalent: > >> 1. SELECT * FROM dates WHERE d >= '1900-01-01' > >> 2. SELECT * FROM dates WHERE EXTRACT(year from d) >= 1900' > >> By functionally equivalent, they will return the same result set. > >> Query 2 does not use the index, adding a performance cost. It seems > >> there is an opportunity for optimization to handle these two queries > >> equivalently to take advantage of the index. > > Or you might try creating an expression index .. > > CREATE INDEX date_year_idx ON dates((extract(year from d))); > Certainly, but won't this add additional overhead in the form of two > indexes; one for the column and one for the expression? > My point is, why force the user to take these extra steps or add > overhead when the the two queries (or two indexes) are functionally > equivalent. But they aren't functionally equivalent. One is an index on a datetime/date, the other is an index just on the year [a DOUBLE]. Date/datetimes potentially have time zones, integer values do not - in general time values are an order of magnitude more complicated than people expect. > Shouldn't this is an optimization handled by the database > so the user doesn't need to hand optimize these differences? Sometimes "d >= '1900-01-01'" and "EXTRACT(year from d) >= 1900" may be equivalent; but not always. -- Adam Tauno Williams <mailto:awilliam@xxxxxxxxxxxxx> GPG D95ED383 Systems Administrator, Python Developer, LPI / NCLA -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance