Pedro Jiménez Pérez wrote:
> Sent: Friday, November 02, 2012 1:14 PM
> To: pgsql-performance@xxxxxxxxxxxxxx
> Subject: help with too slow query
>
> Hello,
>
> I have this table definition:
>
> CREATE TABLE ism_floatvalues
> (
>   id_signal bigint NOT NULL,       -- The signal this value belongs to. Foreign key
>                                    -- referencing the id_signal column of the ism_signal table.
>   time_stamp timestamp without time zone NOT NULL,
>                                    -- Timestamp giving the date and time of this value.
>                                    -- Together with id_signal it forms the primary key of this table.
>   var_value double precision,     -- Stores the actual value of the signal at the
>                                    -- specified timestamp.
>   CONSTRAINT ism_floatvalues_id_signal_fkey FOREIGN KEY (id_signal)
>       REFERENCES ism_signal (id_signal) MATCH SIMPLE
>       ON UPDATE NO ACTION ON DELETE CASCADE
> )
> WITH (
>   OIDS=FALSE
> );
>
> CREATE INDEX ism_floatvalues_index_idsignal_timestamp
>   ON ism_floatvalues
>   USING btree
>   (id_signal, time_stamp DESC);
>
> *********************************************
> Then I run this query:
> *********************************************
> EXPLAIN ANALYZE
> SELECT round(CAST(sum(var_value) AS numeric), 2) AS var_value,
>        date_trunc('month', time_stamp) AS time_stamp,
>        date_part('month', date_trunc('month', time_stamp)) AS month,
>        date_part('year', date_trunc('year', time_stamp)) AS year
> FROM ism_floatvalues
> WHERE id_signal IN
> (
>   SELECT id_signal FROM ism_signal WHERE reference = 'EDCA' AND id_source IN
>   (
>     SELECT id_source FROM ism_installation WHERE id_installation IN
>       (SELECT id_installation FROM ism_groupxinstallation WHERE id_group = 101)
>   )
> )
> AND time_stamp > date_trunc('month', current_date - interval '11 months')
> GROUP BY date_trunc('month', time_stamp), month, year
> ORDER BY time_stamp;
>
> ******************************
> And this is the result:
> ******************************
>
> "GroupAggregate  (cost=4766541.62..4884678.62 rows=39483 width=16) (actual time=1302542.073..1302713.154 rows=10 loops=1)"
[...]
> "  ->  Hash Join  (cost=545.65..3203518.39 rows=9356201 width=16) (actual time=458941.090..1302245.307 rows=9741 loops=1)"
> "        Hash Cond: (ism_floatvalues.id_signal = ism_signal.id_signal)"
> "        ->  Seq Scan on ism_floatvalues  (cost=0.00..2965077.57 rows=28817098 width=24) (actual time=453907.600..1002381.652 rows=29114105 loops=1)"
> "              Filter: (time_stamp > date_trunc('month'::text, (('now'::text)::date - '11 mons'::interval)))"
[...]
>
> This query is very slow, as you can see; it took about 20 minutes to complete. Can someone
> help me improve the performance of this query?
>
> Regards.

The sequential scan takes the lion's share of the time. Are the 29 million rows selected
in that scan a significant percentage of the total rows?

If yes, then the sequential scan is the most efficient way to get the result, and the only
remedy is to get faster I/O or to cache more of the table in RAM. If the query needs to
access that many rows to complete, it is bound to be slow.

Yours,
Laurenz Albe
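
A quick way to check that percentage, using the table and filter from the post above
(reltuples is only the planner's row estimate, so treat it as approximate):

-- Planner's estimate of the total row count (cheap, no table scan):
SELECT reltuples::bigint AS estimated_total_rows
FROM pg_class
WHERE relname = 'ism_floatvalues';

-- Exact count of the rows matching the time filter (this count will itself be slow):
SELECT count(*)
FROM ism_floatvalues
WHERE time_stamp > date_trunc('month', current_date - interval '11 months');

Per the EXPLAIN output above, the second number is about 29 million.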
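
If those 29 million rows turn out to be only a small fraction of the table, an index on
time_stamp alone might let the planner avoid the sequential scan; the existing index leads
with id_signal, so it cannot serve a range condition on time_stamp by itself. A sketch
(the index name here is made up):

CREATE INDEX ism_floatvalues_time_stamp_idx
    ON ism_floatvalues (time_stamp);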