500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

El-Lotso <el.lotso@xxxxxxxxx> · Wed, 12 Sep 2007 00:57:48 +0800

Hi,

appreciate if someone can have some pointers for this. 

PG.8.2.4 1.4G centrino(s/core) 1.5GB ram/5400rpm laptop HD

3 mail tables which has already been selected "out" into separate tables
(useing create table foo as select * from foo_main where x=y)

These test tables containing only a very small subset of the main data's
table (max 1k to 10k rows vs 1.5mill to 7mill rows in the main table)

table definitions and actual query are attached. (names has been altered
to protect the innocent)

I've played around with some tweaking of the postgres.conf setting per
guidance from jdavis (in irc) w/o much(any) improvement. Also tried
re-writing the queries to NOT use subselects (per depesz in irc also)
also yielded nothing spectacular.

The only thing I noticed was that when the subqueries combine more than
3 tables, then PG will choke. If only at 3 joined tables per subquery,
the results come out fast, even for 6K rows.

but if the subqueries (these subqueries by itself, executes fast and
returns results in 1 to 10secs) were done independently and then placed
into a temp table, and then finally joined together using a query such
as

select a.a,b.b,c.c from a inner join b on (x = x) left outer join c on(x
= y)

then it would also be fast

work_mem = 8MB / 32MB /128MB (32 MB default in my setup)
effective_Cache_size = 128MB/500MB (500 default)
shared_buffers = 200MB
geqo_threshold = 5 (default 12)
geqo_effort = 2 (default 5)
ramdom_page_cose = 8.0 (default 4)
maintenance_work_mem = 64MB
join_collapse_limit = 1/8/15  (8 default)
from_collapse_limit = 1/8/15 (8 default)
enable_nestloop = f (on by default)

based on current performance, even with a small number of rows in the
individual tables (max 20k), I can't even get a result out in 2 hours.
(> 3 tables joined per subquery) which is making me re-think of PG's
useful-ness.

BTW, I also tried 8.2.4 CVS_STABLE Branch
                         Table "test_db.d"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 id                      | character varying(12)       | not null
 record_update_date_time | timestamp without time zone |
 code                    | character varying(5)        |
Indexes:
    "d_pkey" PRIMARY KEY, btree (id)
    "idx_d_upd" btree (record_update_date_time)

                       Table "test_db.trh"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 unique_id               | character varying(35)       | not null
 id                      | character varying(12)       |
 hid                     | smallint                    |
 index_1                 | character varying(3)        |
 index_2                 | character varying(3)        |
 record_update_date_time | timestamp without time zone |
 start_timestamp         | timestamp without time zone |
 phase_id                | smallint                    |
 ttype                   | smallint                    |
 pber_2                  | double precision            |
 pber_3                  | double precision            |
 pber_4                  | double precision            |
Indexes:
    "trh_pkey" PRIMARY KEY, btree (unique_id)
    "idx_trh_id" btree (id)

                        Table "test_db.ts"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 unique_id               | character varying(50)       | not null
 id                      | character varying(12)       |
 record_update_date_time | timestamp without time zone |
 start_timestamp         | timestamp without time zone |
 ttype                   | smallint                    |
 seq_time                | timestamp without time zone |
 seq_date                | timestamp without time zone |
 code                    | character varying(5)        |
Indexes:
    "ts_pkey" PRIMARY KEY, btree (unique_id)
    "idx_ts_id" btree (id)

                        Table "test_db.trd"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 unique_id               | character varying(50)       | not null
 id                      | character varying(12)       |
 index_1                 | character varying(3)        |
 index_2                 | character varying(3)        |
 start_timestamp         | timestamp without time zone |
 record_update_date_time | timestamp without time zone |
 phase_id                | smallint                    |
 ttype                   | smallint                    |
 fval_2                  | smallint                    |
 fval_3                  | smallint                    |
 fval_4                  | smallint                    |
Indexes:
    "trd_pkey" PRIMARY KEY, btree (unique_id)
    "idx_td_id" btree (id)

explain analyse SELECT JOIN1.ID, JOIN1.HID, JOIN1.START_TIMESTAMP, JOIN1.PHASE_ID, JOIN1.CODE, JOIN1.LOG_PBER_2, JOIN2.LOG_PBER_3, JOIN3.LOG_PBER_4 FROM (
SELECT TRH.ID, TRH.HID, TRH.START_TIMESTAMP, TRH.PHASE_ID, TS.SEQ_DATE, TS.SEQ_TIME, TRH.TTYPE, D.CODE, 
CASE WHEN PBER_2 = 0 THEN 0 ELSE LOG(TRH.PBER_2) END AS LOG_PBER_2
FROM TRH INNER JOIN D
ON D.ID = TRH.ID
INNER JOIN  TS 
ON TS.ID = TRH.ID
AND TS.TTYPE = TRH.TTYPE
AND TS.START_TIMESTAMP = TRH.START_TIMESTAMP

/*
inner join trd
on trd.id = trh.id
and trd.ttype = trh.ttype
and trd.start_timestamp = trh.start_timestamp
*/
WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' AND
TRH.TTYPE IN (35,75,703,740,764) AND
D.CODE IN ('HUA69') 
)
JOIN1 
LEFT OUTER JOIN (
SELECT TRH.ID, TRH.HID, TRH.START_TIMESTAMP, TRH.PHASE_ID, TS.SEQ_DATE, TS.SEQ_TIME, TRH.TTYPE, D.CODE, 
CASE WHEN PBER_3 = 0 THEN 0 ELSE LOG(TRH.PBER_3) END AS LOG_PBER_3
FROM TRH 
INNER JOIN D
ON D.ID = TRH.ID
INNER JOIN  TS 
ON TS.ID = TRH.ID AND
TS.TTYPE = TRH.TTYPE AND
TS.START_TIMESTAMP = TRH.START_TIMESTAMP 
/*
inner join trd
on trd.id = trh.id
and trd.ttype = trh.ttype
and trd.start_timestamp = trh.start_timestamp
*/
WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' AND
TRH.TTYPE IN (69,178,198,704,757,741,765) AND
D.CODE IN ('HUA69') 
)
JOIN2 
ON JOIN2.ID = JOIN1.ID AND
JOIN2.HID = JOIN1.HID AND
JOIN2.SEQ_TIME = JOIN1.SEQ_TIME AND
JOIN2.SEQ_DATE = JOIN1.SEQ_DATE 
LEFT OUTER JOIN (SELECT TRH.ID,
TRH.HID,
TRH.START_TIMESTAMP,
TRH.PHASE_ID,
TS.SEQ_DATE,
TS.SEQ_TIME,
TRH.TTYPE,
D.CODE, 
CASE WHEN PBER_4 = 0 THEN 0 ELSE LOG(TRH.PBER_4) END AS LOG_PBER_4
FROM TRH 
INNER JOIN D
ON D.ID = TRH.ID
INNER JOIN  TS 
ON TS.ID = TRH.ID AND
TS.TTYPE = TRH.TTYPE AND
TS.START_TIMESTAMP = TRH.START_TIMESTAMP
/*
inner join trd
on trd.id = trh.id
and trd.ttype = trh.ttype
and trd.start_timestamp = trh.start_timestamp
*/
WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' 
AND TRH.TTYPE IN (177,197,705,742,758,766) 
AND D.CODE IN ('HUA69') 

)
JOIN3 
ON JOIN3.ID = JOIN1.ID
AND JOIN3.HID = JOIN1.HID
AND JOIN3.SEQ_TIME = JOIN1.SEQ_TIME
AND JOIN3.SEQ_DATE = JOIN1.SEQ_DATE 

=========================2.5k results w/ 3 tables joined per subquery===============================
Hash Left Join  (cost=4673.17..6517.48 rows=3 width=59) (actual time=204.802..288.454 rows=2552 loops=1)
  Hash Cond: (((trh.id)::text = (join2.id)::text) AND (trh.hid = join2.hid) AND (ts.seq_time = join2.seq_time) AND (ts.seq_date = join2.seq_date))
  ->  Hash Left Join  (cost=2470.15..4314.39 rows=3 width=67) (actual time=111.784..183.319 rows=2552 loops=1)
        Hash Cond: (((trh.id)::text = (join3.id)::text) AND (trh.hid = join3.hid) AND (ts.seq_time = join3.seq_time) AND (ts.seq_date = join3.seq_date))
        ->  Hash Join  (cost=299.32..2143.51 rows=3 width=59) (actual time=16.166..76.762 rows=2552 loops=1)
              Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp))
              ->  Seq Scan on trh  (cost=0.00..1742.67 rows=9021 width=36) (actual time=0.489..40.565 rows=8888 loops=1)
                    Filter: (ttype = ANY ('{35,75,703,740,764}'::integer[]))
              ->  Hash  (cost=284.78..284.78 rows=831 width=63) (actual time=15.620..15.620 rows=969 loops=1)
                    ->  Hash Join  (cost=37.86..284.78 rows=831 width=63) (actual time=1.332..13.315 rows=969 loops=1)
                          Hash Cond: ((ts.id)::text = (d.id)::text)
                          ->  Seq Scan on ts  (cost=0.00..226.44 rows=3244 width=40) (actual time=0.117..6.091 rows=3244 loops=1)
                          ->  Hash  (cost=35.56..35.56 rows=184 width=23) (actual time=1.195..1.195 rows=186 loops=1)
                                ->  Seq Scan on d  (cost=0.00..35.56 rows=184 width=23) (actual time=0.479..0.854 rows=186 loops=1)
                                      Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text))
        ->  Hash  (cost=2170.77..2170.77 rows=3 width=42) (actual time=95.550..95.550 rows=2488 loops=1)
              ->  Subquery Scan join3  (cost=299.32..2170.77 rows=3 width=42) (actual time=15.951..89.594 rows=2488 loops=1)
                    ->  Hash Join  (cost=299.32..2170.74 rows=3 width=61) (actual time=15.946..83.598 rows=2488 loops=1)
                          Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp))
                          ->  Seq Scan on trh  (cost=0.00..1775.26 rows=8543 width=36) (actual time=0.416..44.637 rows=8624 loops=1)
                                Filter: (ttype = ANY ('{177,197,705,742,758,766}'::integer[]))
                          ->  Hash  (cost=284.78..284.78 rows=831 width=63) (actual time=15.472..15.472 rows=969 loops=1)
                                ->  Hash Join  (cost=37.86..284.78 rows=831 width=63) (actual time=1.311..13.160 rows=969 loops=1)
                                      Hash Cond: ((ts.id)::text = (d.id)::text)
                                      ->  Seq Scan on ts  (cost=0.00..226.44 rows=3244 width=40) (actual time=0.108..5.779 rows=3244 loops=1)
                                      ->  Hash  (cost=35.56..35.56 rows=184 width=23) (actual time=1.180..1.180 rows=186 loops=1)
                                            ->  Seq Scan on d  (cost=0.00..35.56 rows=184 width=23) (actual time=0.470..0.850 rows=186 loops=1)
                                                  Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text))
  ->  Hash  (cost=2202.96..2202.96 rows=3 width=42) (actual time=92.933..92.933 rows=2704 loops=1)
        ->  Subquery Scan join2  (cost=299.32..2202.96 rows=3 width=42) (actual time=16.333..86.181 rows=2704 loops=1)
              ->  Hash Join  (cost=299.32..2202.93 rows=3 width=61) (actual time=16.329..79.362 rows=2704 loops=1)
                    Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp))
                    ->  Seq Scan on trh  (cost=0.00..1807.85 rows=8508 width=36) (actual time=0.438..41.233 rows=8560 loops=1)
                          Filter: (ttype = ANY ('{69,178,198,704,757,741,765}'::integer[]))
                    ->  Hash  (cost=284.78..284.78 rows=831 width=63) (actual time=15.840..15.840 rows=969 loops=1)
                          ->  Hash Join  (cost=37.86..284.78 rows=831 width=63) (actual time=1.482..13.557 rows=969 loops=1)
                                Hash Cond: ((ts.id)::text = (d.id)::text)
                                ->  Seq Scan on ts  (cost=0.00..226.44 rows=3244 width=40) (actual time=0.113..6.124 rows=3244 loops=1)
                                ->  Hash  (cost=35.56..35.56 rows=184 width=23) (actual time=1.346..1.346 rows=186 loops=1)
                                      ->  Seq Scan on d  (cost=0.00..35.56 rows=184 width=23) (actual time=0.621..0.997 rows=186 loops=1)
                                            Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text))
Total runtime: 291.763 ms

===============================

If I were to un-comment the join to table trd, the PG will choke and not come out with results for up to 20+mins (for the above 2.5K results sets retrieved) For 6K results, 2 hours and it's still crunching.

re-writing to NOT use subselects and also joining 4 tables will also choke

explain analyse select d.id, d.code, trh.hid, trh.start_timestamp, trh.phase_id, log(trh.pber_2) as log_pber_2, log(trh2.pber_3) as log_pber_3,log(trh3.pber_4) as log_pber_4,trd.fval_2, trd2.fval_3,trd3.fval_4
from 
d inner join trd on trd.id = d.id inner join (
trh inner join ts on (trh.id = ts.id and ts.ttype = trh.ttype and ts.start_timestamp = trh.start_timestamp)) 
on (d.id = trh.id and trh.id = trd.id and trd.ttype = trh.ttype and trd.start_timestamp = trh.start_timestamp and TRH.TTYPE IN (35,75,703,740,764))
left join (
trh trh2 
inner join ts ts2 
on (trh2.id = ts2.id and ts2.ttype = trh2.ttype and ts2.start_timestamp = trh2.start_timestamp) 
inner join trd trd2 
on trd2.id = trh2.id and trd2.ttype = trh2.ttype and trd2.start_timestamp = trh2.start_timestamp
)
on (d.id = trh2.id and ts2.seq_date = ts.seq_date and ts2.seq_time = ts.seq_time and trh2.hid=trh.hid and TRH2.TTYPE IN (69,178,198,704,757,741,765))
left join (
trh trh3 
inner join ts ts3 
on (trh3.id = ts3.id and ts3.ttype = trh3.ttype and ts3.start_timestamp = trh3.start_timestamp)
inner join trd trd3
on trd3.id = trh3.id and trd3.ttype = trh3.ttype and trd3.start_timestamp = trh3.start_timestamp
) 
on (d.id = trh3.id and ts3.seq_date = ts.seq_date and ts3.seq_time = ts.seq_time and trh3.hid=trh.hid and TRH3.TTYPE IN (177,197,705,742,758,766))
where d.code = 'HUA69' 

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match