PostgreSQL runs a query much slower than BDE and MySQL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user 
Paradox to a web based interface to either MySQL or PostgreSQL.
The database is a pedigree sheep breed society database recording sheep and 
flocks (amongst other things).

My current problem is with one table and an associated query which takes 10 
times longer to execute on PostgreSQL than BDE, which in turn takes 10 times 
longer than MySQL. The table links sheep to flocks and is created as follows:

CREATE TABLE SHEEP_FLOCK
(
  regn_no varchar(7) NOT NULL,
  flock_no varchar(6) NOT NULL,
  transfer_date date NOT NULL,
  last_changed date NOT NULL,
  CONSTRAINT SHEEP_FLOCK_pkey PRIMARY KEY (regn_no, flock_no, 
transfer_date)
) 
WITHOUT OIDS;
ALTER TABLE SHEEP_FLOCK OWNER TO postgres;

I then populate the table with 

COPY SHEEP_FLOCK
FROM 'e:/ssbg/devt/devt/export_data/sheep_flock.txt'
WITH CSV HEADER

The table then has about 82000 records

The query I run is:

/* Select all sheep who's most recent transfer was into the subject flock */
SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
FROM SHEEP_FLOCK f1 JOIN 
    /* The last transfer date for each sheep */
    (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
    FROM  SHEEP_FLOCK f
    GROUP BY f.regn_no) f2 
ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
AND f1.transfer_date = f2.last_xfer_date

The sub-select on it's own returns about 32000 rows.

Using identically structured tables and the same primary key, if I run this on 
Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about 3ms, 
and on PostgresSQL (8.1.3, local server) about 1290ms). All on the same 
Windows XP Pro machine with 512MB ram of which nearly half is free.  

The query plan shows most of the time is spent sorting the 30000+ rows from the subquery, so I added a further
subquery as follows: 

/* Select all sheep who's most recent transfer was into the subject flock */
SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
FROM SHEEP_FLOCK f1 JOIN 
    /* The last transfer date for each sheep */
    (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
    FROM  SHEEP_FLOCK f
    WHERE f.regn_no IN 
        /* Limit the rows extracted by the outer sub-query to those relevant to the 
subject flock */
	/* This typically reduces the time from 1297ms to 47ms - from 35000 rows 
to 127 rows */
	(SELECT s.regn_no FROM SHEEP_FLOCK s where s.flock_no = '1359')
    GROUP BY f.regn_no) f2 
ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
AND f1.transfer_date = f2.last_xfer_date

then as the comment suggests I get a considerable improvement, but it's still an 
order of magnitude slower than MySQL.

Can anyone suggest why PostgreSQL performs the original query so much slower than even BDE?
 -- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux