Re: BUG #2658: Query not using index

Hmmm.  How many distinct assetids are there?
-- Mark Lewis

On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
> The "summary table" approach maintained by triggers is something we are 
> considering, but it becomes a bit more complicated to implement.  
> Currently we have groups of new positions coming in every few seconds or 
> less.  They are not guaranteed to be in order.  So for instance, a group 
> of positions from today could come in and be inserted, then a group of 
> positions that got lost from yesterday could come in and be inserted 
> afterwards. 
> 
> This means the triggers would have to do some sort of logic to figure 
> out if the newly inserted position is actually the most recent by 
> timestamp.  If positions are ever deleted or updated, the same sort of 
> query that is currently running slow will need to be executed in order 
> to get the new most recent position.  So there is the possibility that 
> new positions can be inserted faster than the triggers can calculate 
> and maintain the summary table.  There are some other complications 
> with maintaining such a summary table in our system too, but I won't get 
> into those.
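
For concreteness, here is a minimal sketch of the insert-side trigger being
described, assuming a positions(assetid, ts, ...) main table and a
latest_position(assetid, max_ts) summary table; all of these names are
placeholders, not from the thread, and races between concurrent inserts are
ignored.  The point is that the trigger only touches the summary row when the
incoming timestamp is newer, so a late batch of yesterday's positions cannot
clobber today's value:

    -- Hypothetical schema (placeholder names):
    --   positions(assetid integer, ts timestamp, ...)              -- main table
    --   latest_position(assetid integer PRIMARY KEY, max_ts timestamp)

    CREATE OR REPLACE FUNCTION update_latest_position() RETURNS trigger AS $$
    BEGIN
        -- Overwrite only when the incoming row is newer.
        UPDATE latest_position
           SET max_ts = NEW.ts
         WHERE assetid = NEW.assetid
           AND (max_ts IS NULL OR max_ts < NEW.ts);

        IF NOT FOUND THEN
            -- Nothing updated: either there is no summary row for this
            -- asset yet, or the existing row is already newer than NEW.ts.
            INSERT INTO latest_position (assetid, max_ts)
            SELECT NEW.assetid, NEW.ts
             WHERE NOT EXISTS (SELECT 1 FROM latest_position
                                WHERE assetid = NEW.assetid);
        END IF;

        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER positions_latest_ins
        AFTER INSERT ON positions
        FOR EACH ROW EXECUTE PROCEDURE update_latest_position();
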
> 
> Right now I'm just trying to see if I can get the query itself running 
> faster, which would be the easiest solution for now.
> 
> Graham.
> 
> 
> Mark Lewis wrote:
> 
> >Have you looked into a materialized view sort of approach?  You could
> >create a table which had assetid as a primary key, and max_ts as a
> >column.  Then use triggers to keep that table up to date as rows are
> >added/updated/removed from the main table.
> >
> >This approach would only make sense if there were far fewer distinct
> >assetid values than rows in the main table, and would get slow if you
> >commonly delete rows from the main table or decrease the value for ts in
> >the row with the highest ts for a given assetid.
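
To illustrate that caveat, the delete side of such a trigger might look
roughly like this (same placeholder positions/latest_position names as in the
sketch above): when the row that supplied max_ts goes away, the per-asset
maximum has to be recomputed, which is exactly the kind of query that gets
expensive if deletes are frequent.

    CREATE OR REPLACE FUNCTION prune_latest_position() RETURNS trigger AS $$
    BEGIN
        -- Recompute only if the deleted row was the one backing max_ts.
        -- If no positions remain for the asset, max_ts simply becomes NULL
        -- (a real implementation might delete the summary row instead).
        UPDATE latest_position
           SET max_ts = (SELECT max(ts) FROM positions
                          WHERE assetid = OLD.assetid)
         WHERE assetid = OLD.assetid
           AND max_ts = OLD.ts;
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER positions_latest_del
        AFTER DELETE ON positions
        FOR EACH ROW EXECUTE PROCEDURE prune_latest_position();

With an index on (assetid, ts) the recomputation should be cheap per deleted
max row, but a mass delete still fires it once per affected row.
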
> >
> >-- Mark Lewis
> >
> >On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> >
> >>Thanks Tom, that explains it and makes sense.  I guess I will have to 
> >>accept this query taking 40 seconds, unless I can figure out another way 
> >>to write it so it can use indexes.  If there are any more syntax 
> >>suggestions, please pass them on.  Thanks for the help everyone.
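
One syntax variant that is sometimes worth trying for "latest row per group"
problems is DISTINCT ON (the positions/assetid/ts names below are assumed, as
above).  It is equivalent to the GROUP BY max as long as ts is never NULL,
but it still has to visit every row unless the planner can walk an index
whose order matches, so it may or may not beat the 40-second plan:

    SELECT DISTINCT ON (assetid) assetid, ts AS max_ts
      FROM positions
     ORDER BY assetid, ts DESC;
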
> >>
> >>Graham.
> >>
> >>
> >>Tom Lane wrote:
> >>
> >>>Graham Davis <gdavis@xxxxxxxxxxxxxxx> writes:
> >>>
> >>>>How come an aggregate like that has to use a sequential scan?  I know
> >>>>that PostgreSQL used to have to do a sequential scan for all aggregates,
> >>>>but support was added in version 8 so that aggregates would take
> >>>>advantage of indexes.
> >>>
> >>>Not in a GROUP BY context, only for the simple case.  Per the comment in
> >>>planagg.c:
> >>>
> >>>	 * We don't handle GROUP BY, because our current implementations of
> >>>	 * grouping require looking at all the rows anyway, and so there's not
> >>>	 * much point in optimizing MIN/MAX.
> >>>
> >>>The problem is that using an index to obtain the maximum value of ts for
> >>>a given value of assetid is not the same thing as finding out what all
> >>>the distinct values of assetid are.
> >>>
> >>>This could possibly be improved but it would take a considerable amount
> >>>more work.  It's definitely not in the category of "bug fix".
> >>>
> >>>			regards, tom lane
> >>
> 
> 
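
A closing note on Tom's point: the simple, single-asset case is still
optimized, so if the distinct assetids can be obtained cheaply from somewhere
other than the big table (a small assets table, say; the names below are
assumptions, not from the thread), one index-backed max per asset may be far
cheaper than scanning everything, given an index on (assetid, ts):

    SELECT a.assetid,
           (SELECT max(p.ts)
              FROM positions p
             WHERE p.assetid = a.assetid) AS max_ts
      FROM assets a;

Whether the planner turns each subquery into an index probe depends on the
version and on the index, so treat this as a sketch to try, not a guaranteed
win; it also ties back to the opening question of how many distinct assetids
there are.
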

