Re: BUG #2658: Query not using index

Graham Davis <gdavis@xxxxxxxxxxxxxxx> · Tue, 03 Oct 2006 14:23:26 -0700

The "summary table" approach maintained by triggers is something we are 
considering, but it becomes a bit more complicated to implement.  
Currently we have groups of new positions coming in every few seconds or 
less.  They are not guaranteed to be in order.  So for instance, a group 
of positions from today could come in and be inserted, then a group of 
positions that got lost from yesterday could come in and be inserted 
afterwards. 

This means the triggers would have to do some sort of logic to figure 
out if the newly inserted position is actually the most recent by 
timestamp.  If positions are ever deleted or updated, the same sort of 
query that is currently running slow will need to be executed in order 
to get the new most recent position.  So there is the possibility that 
new positions can be inserted faster than the triggers can calculate 
and  maintain the summary table.  There are some other complications 
with maintaining such a summary table in our system too, but I won't get 
into those.

Right now I'm just trying to see if I can get the query itself running 
faster, which would be the easiest solution for now.

Graham.

Mark Lewis wrote:

Have you looked into a materialized view sort of approach?  You could
create a table which had assetid as a primary key, and max_ts as a
column.  Then use triggers to keep that table up to date as rows are
added/updated/removed from the main table.

This approach would only make sense if there were far fewer distinct
assetid values than rows in the main table, and would get slow if you
commonly delete rows from the main table or decrease the value for ts in
the row with the highest ts for a given assetid.

-- Mark Lewis

On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:

Thanks Tom, that explains it and makes sense.  I guess I will have to 
accept this query taking 40 seconds, unless I can figure out another way 
to write it so it can use indexes.  If there are any more syntax 
suggestions, please pass them on.  Thanks for the help everyone.

Graham.

Tom Lane wrote:

Graham Davis <gdavis@xxxxxxxxxxxxxxx> writes:

How come an aggreate like that has to use a sequential scan?  I know 
that PostgreSQL use to have to do a sequential scan for all aggregates, 
but there was support added to version 8 so that aggregates would take 
advantage of indexes.

Not in a GROUP BY context, only for the simple case.  Per the comment in
planagg.c:

	 * We don't handle GROUP BY, because our current implementations of
	 * grouping require looking at all the rows anyway, and so there's not
	 * much point in optimizing MIN/MAX.

The problem is that using an index to obtain the maximum value of ts for
a given value of assetid is not the same thing as finding out what all
the distinct values of assetid are.

This could possibly be improved but it would take a considerable amount
more work.  It's definitely not in the category of "bug fix".

			regards, tom lane

--
Graham Davis
Refractions Research Inc.
gdavis@xxxxxxxxxxxxxxx