I'm getting low performance on SUM and GROUP BY queries.
How can I improve my database to perform such queries.
Here is my table schema:
=> \d acct_2010_25
Tabela "public.acct_2010_25"
Coluna | Tipo | Modificadores
----------------+-----------------------------+------------------------------------------------------------------------
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
as_src | bigint | not null default 0
as_dst | bigint | not null default 0
port_src | integer | not null default 0
port_dst | integer | not null default 0
tcp_flags | smallint | not null default 0
ip_proto | smallint | not null default 0
packets | integer | not null
flows | integer | not null default 0
bytes | bigint | not null
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00 BC'::timestamp without time zone
stamp_updated | timestamp without time zone |
Índices:
"acct_2010_25_pk" PRIMARY KEY, btree (stamp_inserted, ip_src, ip_dst, port_src, port_dst, ip_proto)
"ibytes_acct_2010_25" btree (bytes)
Here is my one query example (could add pk to flow and packet fields):
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual time=276981.107..276981.133 rows=50 loops=1)
-> Sort (cost=3998662.81..4001046.07 rows=953305 width=50) (actual time=276981.105..276981.107 rows=50 loops=1)
Sort Key: sum(bytes)
-> GroupAggregate (cost=3499863.27..3754872.33 rows=953305 width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1)
-> Sort (cost=3499863.27..3523695.89 rows=9533049 width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1)
Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto
-> Seq Scan on acct_2010_25 (cost=0.00..352648.10 rows=9533049 width=50) (actual time=0.038..50860.391 rows=9494165 loops=1)
Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone))
Total runtime: 278791.661 ms
(9 registros)
Another one just summing bytes (still low):
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual time=106261.359..106261.451 rows=50 loops=1)
-> GroupAggregate (cost=3395202.50..3602225.48 rows=974226 width=42) (actual time=106261.357..106261.435 rows=50 loops=1)
-> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42) (actual time=106261.107..106261.169 rows=176 loops=1)
Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto
-> Seq Scan on acct_2010_25 (cost=0.00..367529.72 rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1)
Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone))
Total runtime: 109911.882 ms
(7 registros)
The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM.
I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just increased checkpoint_segments to 50).
What can I change to increase performance?
Thanks in advance.
Cheers.
Sergio Roberto Charpinel Jr.