Re: How to create a multi-column index with 2 dates using 'gist'?

Fred Janon <fjanon@xxxxxxxxx> · Tue, 25 Aug 2009 19:39:26 +0800

Thanks Sam. I looked at the gist documentation and although it would be fun, I don't have the time at the moment to explore that avenue (and scratching my head!). I also think it would require a lot of work testing to validate the code and that the gist index is better than the B-tree one. So I am following your advice using a B-tree index for now.

Basically I have an events table representing events with a duration (startdate, enddate). I was wondering if it would improve the performance if I was creating a separate table (indexed as you suggested) with the date ranges (startdate, enddate) and point to that from my events table. That would eliminate the duplicate ranges, costing a join to find the events within a date range, but maybe improving the search performance for events that overlap a certain date range. Any feedback on that?

Thanks

Fred

On Tue, Aug 25, 2009 at 18:52, Sam Mason <sam@xxxxxxxxxxxxx> wrote:

On Mon, Aug 24, 2009 at 05:24:59PM +0800, Fred Janon wrote:

> I am using 8.3 and pgAdmin III. I have a couple of tables using 2 DATE

> columns like 'startdate' and 'enddate' (just date, not interested in time in

> these columns). I have some queries (some using OVERLAPS) involving both

> 'startdate' and 'enddate' columns. I tried to create a multi column index

> using pgAdmin and it comes back with this error:

>

> ERROR: data type date has no default operator class for access method "gist"

> HINT: You must specify an operator class for the index or define a default

> operator class for the data type.

I've not had the opportunity to try doing this, but it would seem to

require hacking some C code to get this working.  Have a look here:

  http://www.postgresql.org/docs/current/static/gist.html

> I search the pdf docs and online without finding what an "operator class"

> for DATE would be. Would a multi-column index help in that case (OVERLAPS

> and dates comparison) anyway? Or should I just define an index for each of

> the dates?

An operator class bundles together various bits of code so that the

index knows which functions to call when it needs to compare things.

If you were creating an GiST index over a pair of dates to support

an "overlaps" operator you'd have to define a set of functions that

implement the various checks needed.

Depending on your data you may be easier with just a multi-column index

and using normal comparisons, I can't see how OVERLAPS could use indexes

as it does some strange things with NULL values.  The cases a B-Tree

index would win over GiST (this is an educated guess) is when few of the

ranges overlap within a table.  If that's the case then I'd do:

  CREATE INDEX tbl_start_end_idx ON tbl (startdate,enddate);

to create the btree index (they're the default, so nothing else is

needed) and then write queries as:

  SELECT r.range, t.*

  FROM tbl t, ranges r

  WHERE t.startdate <= r.rangeend

    AND t.enddate   >= r.rangestart;

if there are lots of overlapping ranges in the table then this is going

to do badly and you may need to start thinking about writing some C code

to get a GiST index going.

--

  Sam  http://samason.me.uk/

--

Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)

To make changes to your subscription:

http://www.postgresql.org/mailpref/pgsql-general