
Re: Slow query and indexes...


 



There are other ways to influence the selection of a seqscan, notably effective_cache_size and random_page_cost.

First, you need to find out at what point a seqscan actually becomes faster than an index scan. That's going to be a trial-and-error search, but if you go back far enough in time the seqscan will eventually win. EXPLAIN ANALYZE has its own overhead, so a better way to test this is with psql's \timing command, wrapping the query in a count so you're not shoving a bunch of data across to psql:

SELECT count(*) FROM (... your query goes here ...) a;

(Using SELECT 1 in place of the column list might work too and would be more accurate.)
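As a sketch of that test (reusing the table and cutoff from Jonas's query below; the '14 weeks' interval is just one candidate point in the trial-and-error search):

```sql
-- Measure wall-clock time from psql without EXPLAIN ANALYZE's
-- per-row instrumentation overhead.
\timing on

-- Wrap the candidate query in a count so only a single number
-- crosses the wire to psql.
SELECT count(*) FROM (
    SELECT max(date_time)
    FROM data_values
    WHERE date_time > (now() - '14 weeks'::interval)
    GROUP BY data_logger_id
) a;
```

Repeat with different intervals until the seqscan and index-scan timings cross over.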

Once you've found the break-even point, you can tweak the cost estimates. Start by making sure that effective_cache_size is set to approximately how much memory you have available for caching; increasing it will favor an index scan. Decreasing random_page_cost will also favor an index scan, though I'd try not to go below 2 and definitely not below 1. You can also tweak the CPU cost estimates (lower numbers favor indexes). Keep in mind, though, that changing these at the system level will impact every query running on the system, so you may have no choice but to set custom parameters for just this statement. Using SET LOCAL and wrapping the SELECT in a transaction is a less painful way to do that.
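A sketch of the SET LOCAL approach (the parameter values here are placeholders you'd arrive at through the break-even testing above, not recommendations):

```sql
-- SET LOCAL scopes the override to this transaction only;
-- it reverts automatically at COMMIT or ROLLBACK, so other
-- queries on the system are unaffected.
BEGIN;
SET LOCAL random_page_cost = 2.0;        -- example value; tune via the break-even test
SET LOCAL effective_cache_size = '1GB';  -- example value; set near your actual cache memory

SELECT max(date_time) FROM data_values
WHERE date_time > (now() - '15 weeks'::interval)
GROUP BY data_logger_id;
COMMIT;
```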

On May 7, 2007, at 10:47 AM, Jonas Henriksen wrote:

Well thanks, but that doesn't help me much.

I've tried setting an extra condition using date_time > (now() - '14 weeks'::interval):

explain analyze
SELECT max(date_time) FROM data_values
where date_time > (now() - '14 weeks'::interval)
GROUP BY data_logger_id;

HashAggregate  (cost=23264.52..23264.55 rows=2 width=12) (actual time=1691.447..1691.454 rows=3 loops=1)
  ->  Bitmap Heap Scan on data_values  (cost=7922.08..21787.31 rows=295442 width=12) (actual time=320.643..951.043 rows=298589 loops=1)
        Recheck Cond: (date_time > (now() - '98 days'::interval))
        ->  Bitmap Index Scan on data_values_data_date_time_index  (cost=0.00..7848.22 rows=295442 width=0) (actual time=319.708..319.708 rows=298589 loops=1)
              Index Cond: (date_time > (now() - '98 days'::interval))
Total runtime: 1691.598 ms

However, when I switch to using date_time > (now() - '15 weeks'::interval) I get:
explain analyze
SELECT max(date_time) FROM data_values
where date_time > (now() - '15 weeks'::interval)
GROUP BY data_logger_id;

HashAggregate  (cost=23798.26..23798.28 rows=2 width=12) (actual time=3237.816..3237.823 rows=3 loops=1)
  ->  Seq Scan on data_values  (cost=0.00..22084.62 rows=342728 width=12) (actual time=0.037..2409.234 rows=344111 loops=1)
        Filter: (date_time > (now() - '105 days'::interval))
Total runtime: 3237.944 ms

Doing "SET enable_seqscan=off" speeds up the query and forces the use of the index, but I don't really love that solution...


regards Jonas:))




On 5/7/07, Peter Eisentraut <peter_e@xxxxxxx> wrote:
On Monday, 7 May 2007 15:53, Jonas Henriksen wrote:
> while if I add a GROUP BY data_logger the query uses a seq scan and a
>
> lot of time:
> >> explain analyze SELECT max(date_time) FROM data_values GROUP BY
> data_logger_id;

I don't think there is anything you can do about this.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/




--
Jim Nasby                                            jim@xxxxxxxxx
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



