Re: Speeding up query, Joining 55mil and 43mil records.

"Dave Dutcher" <dave@xxxxxxxxxxxx> · Wed, 21 Jun 2006 09:53:03 -0500

Title: Message

Could 
you post an explain analyze of the query?  Just FYI, if you do an explain 
analyze of the insert statement, it will actually do the insert.  If you 
don't want that just post an explain analyze of the select 
part.

To me 
it would be interesting to compare just the select parts of the query between 
Postgres and MSSQL.  That way you would know if your Postgres install is 
slower at the query or slower at the insert.

  -----Original Message-----
From: 
  pgsql-performance-owner@xxxxxxxxxxxxxx 
  [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of 
  nicky
Sent: Wednesday, June 21, 2006 8:47 AM
To: 
  pgsql-performance@xxxxxxxxxxxxxx
Subject: [PERFORM] Speeding up 
  query, Joining 55mil and 43mil records. 

Hello People, 

I'm trying to solve a 'what i feel is a' 
  performance/configuration/query error on my side. I'm fairly new to 
  configuring PostgreSQL so, i might be completely wrong with my configuration. 

My database consists of 44 tables, about 20GB. Two of those tables are 
  'big/huge'. Table src.src_faktuur_verricht contains 43million records (9GB) 
  and table src.src_faktuur_verrsec contains 55million records (6GB). 

Below is the 'slow' query. 

INSERT INTO 
  rpt.rpt_verrichting
(verrichting_id
,verrichting_secid
,fout_status
,patientnr
,verrichtingsdatum
,locatie_code
,afdeling_code
,uitvoerder_code
,aanvrager_code
,verrichting_code
,dbcnr
,aantal_uitgevoerd
,kostenplaats_code
,vc_patientnr
,vc_verrichting_code
,vc_dbcnr
)
SELECT  
  t1.id
,       
  t0.secid
,       
  t1.status
,       
  t1.patientnr
,       
  t1.datum
,       
  t1.locatie
,       
  t1.afdeling
,       
  t1.uitvoerder
,       
  t1.aanvrager
,       
  t0.code
,       
  t1.casenr
,       
  t0.aantal
,       
  t0.kostplaats
,       
  null
,       
  null
,       null
FROM    
  src.src_faktuur_verrsec t0 JOIN

  src.src_faktuur_verricht t1 ON

  t0.id = t1.id
WHERE   substr(t0.code,1,2) not in 
  ('14','15','16','17')
AND     (substr(t0.correctie,4,1) 
  <> '1' OR t0.correctie is null)
AND     
  EXTRACT(YEAR from t1.datum) > 2004;

Output from 
  explain

Hash Join  (cost=1328360.12..6167462.76 rows=7197568 
  width=118)
  Hash Cond: (("outer".id)::text = 
  ("inner".id)::text)

  ->  Seq Scan on src_faktuur_verrsec 
  t0  (cost=0.00..2773789.90 rows=40902852 
  width=52)
        Filter: 
  ((substr((code)::text, 1, 2) <> '14'::text) AND (substr((code)::text, 1, 
  2) <> '15'::text) AND (substr((code)::text, 1, 2) <> '16'::text) 
  AND (substr((code)::text, 1, 2) <> '17'::text) AND 
  ((substr((correctie)::text, 4, 1) <> '1'::text) OR (correctie IS 
  NULL)))
  ->  Hash  (cost=1188102.97..1188102.97 
  rows=8942863 width=80)

  ->  Bitmap Heap Scan on src_faktuur_verricht t1  
  (cost=62392.02..1188102.97 rows=8942863 
  width=80)

  Recheck Cond: (date_part('year'::text, datum) > 2004::double 
  precision)

  ->  Bitmap Index Scan on src_faktuur_verricht_idx1  
  (cost=0.00..62392.02 rows=8942863 
  width=0)

  Index Cond: (date_part('year'::text, datum) > 2004::double 
  precision)

The db server runs PostgreSQL 8.1.4 on FreeBSD 
  6.1-Stable. 2GB of RAM. 
It contains two SATA150 disks, one contains 
  PostgreSQL and the rest of the operating system and the other disk holds the 
  pg_xlog directory.

Changed lines from my postgresql.conf 
  file

shared_buffers = 8192
temp_buffers = 4096
work_mem = 
  65536
maintenance_work_mem = 1048576
max_fsm_pages = 40000
fsync = 
  off
wal_buffers = 64
effective_cache_size = 174848

The query 
  above takes around 42 minutes. 

However, i also have a wimpy desktop 
  machine with 1gb ram. Windows with MSSQL 2000 (default installation), same 
  database structure, same indexes, same query, etc and it takes 17 minutes. The 
  big difference makes me think that i've made an error with my PostgreSQL 
  configuration. I just can't seem to figure it out. 

Could someone 
  perhaps give me some pointers, advice?

Thanks in advance. 

Nicky