Could
you post an explain analyze of the query? Just FYI, if you do an explain
analyze of the insert statement, it will actually do the insert. If you
don't want that just post an explain analyze of the select
part.
To me
it would be interesting to compare just the select parts of the query between
Postgres and MSSQL. That way you would know if your Postgres install is
slower at the query or slower at the insert.
Hello People,
I'm trying to solve a 'what i feel is a'
performance/configuration/query error on my side. I'm fairly new to
configuring PostgreSQL so, i might be completely wrong with my configuration.
My database consists of 44 tables, about 20GB. Two of those tables are
'big/huge'. Table src.src_faktuur_verricht contains 43million records (9GB)
and table src.src_faktuur_verrsec contains 55million records (6GB).
Below is the 'slow' query.
INSERT INTO
rpt.rpt_verrichting (verrichting_id ,verrichting_secid ,fout_status ,patientnr ,verrichtingsdatum ,locatie_code ,afdeling_code ,uitvoerder_code ,aanvrager_code ,verrichting_code ,dbcnr ,aantal_uitgevoerd ,kostenplaats_code ,vc_patientnr ,vc_verrichting_code ,vc_dbcnr ) SELECT
t1.id ,
t0.secid ,
t1.status ,
t1.patientnr ,
t1.datum ,
t1.locatie ,
t1.afdeling ,
t1.uitvoerder ,
t1.aanvrager ,
t0.code ,
t1.casenr ,
t0.aantal ,
t0.kostplaats ,
null ,
null , null FROM
src.src_faktuur_verrsec t0 JOIN
src.src_faktuur_verricht t1 ON
t0.id = t1.id WHERE substr(t0.code,1,2) not in
('14','15','16','17') AND (substr(t0.correctie,4,1)
<> '1' OR t0.correctie is null) AND
EXTRACT(YEAR from t1.datum) > 2004;
Output from
explain
Hash Join (cost=1328360.12..6167462.76 rows=7197568
width=118) Hash Cond: (("outer".id)::text =
("inner".id)::text)
-> Seq Scan on src_faktuur_verrsec
t0 (cost=0.00..2773789.90 rows=40902852
width=52) Filter:
((substr((code)::text, 1, 2) <> '14'::text) AND (substr((code)::text, 1,
2) <> '15'::text) AND (substr((code)::text, 1, 2) <> '16'::text)
AND (substr((code)::text, 1, 2) <> '17'::text) AND
((substr((correctie)::text, 4, 1) <> '1'::text) OR (correctie IS
NULL))) -> Hash (cost=1188102.97..1188102.97
rows=8942863 width=80)
-> Bitmap Heap Scan on src_faktuur_verricht t1
(cost=62392.02..1188102.97 rows=8942863
width=80)
Recheck Cond: (date_part('year'::text, datum) > 2004::double
precision)
-> Bitmap Index Scan on src_faktuur_verricht_idx1
(cost=0.00..62392.02 rows=8942863
width=0)
Index Cond: (date_part('year'::text, datum) > 2004::double
precision)
The db server runs PostgreSQL 8.1.4 on FreeBSD
6.1-Stable. 2GB of RAM. It contains two SATA150 disks, one contains
PostgreSQL and the rest of the operating system and the other disk holds the
pg_xlog directory.
Changed lines from my postgresql.conf
file
shared_buffers = 8192 temp_buffers = 4096 work_mem =
65536 maintenance_work_mem = 1048576 max_fsm_pages = 40000 fsync =
off wal_buffers = 64 effective_cache_size = 174848
The query
above takes around 42 minutes.
However, i also have a wimpy desktop
machine with 1gb ram. Windows with MSSQL 2000 (default installation), same
database structure, same indexes, same query, etc and it takes 17 minutes. The
big difference makes me think that i've made an error with my PostgreSQL
configuration. I just can't seem to figure it out.
Could someone
perhaps give me some pointers, advice?
Thanks in advance.
Nicky
|