sorry.. I sent this as I was about to go to bed and the explain analyse of the query w/ 4 tables joined per subquery came out. So.. attaching it.. On Wed, 2007-09-12 at 00:57 +0800, El-Lotso wrote: > Hi, > > appreciate if someone can have some pointers for this. > > PG.8.2.4 1.4G centrino(s/core) 1.5GB ram/5400rpm laptop HD > > 3 mail tables which has already been selected "out" into separate tables > (useing create table foo as select * from foo_main where x=y) > > These test tables containing only a very small subset of the main data's > table (max 1k to 10k rows vs 1.5mill to 7mill rows in the main table) > > table definitions and actual query are attached. (names has been altered > to protect the innocent) > > I've played around with some tweaking of the postgres.conf setting per > guidance from jdavis (in irc) w/o much(any) improvement. Also tried > re-writing the queries to NOT use subselects (per depesz in irc also) > also yielded nothing spectacular. > > The only thing I noticed was that when the subqueries combine more than > 3 tables, then PG will choke. If only at 3 joined tables per subquery, > the results come out fast, even for 6K rows. > > but if the subqueries (these subqueries by itself, executes fast and > returns results in 1 to 10secs) were done independently and then placed > into a temp table, and then finally joined together using a query such > as > > select a.a,b.b,c.c from a inner join b on (x = x) left outer join c on(x > = y) > > then it would also be fast > > work_mem = 8MB / 32MB /128MB (32 MB default in my setup) > effective_Cache_size = 128MB/500MB (500 default) > shared_buffers = 200MB > geqo_threshold = 5 (default 12) > geqo_effort = 2 (default 5) > ramdom_page_cose = 8.0 (default 4) > maintenance_work_mem = 64MB > join_collapse_limit = 1/8/15 (8 default) > from_collapse_limit = 1/8/15 (8 default) > enable_nestloop = f (on by default) > > based on current performance, even with a small number of rows in the > individual tables (max 20k), I can't even get a result out in 2 hours. > (> 3 tables joined per subquery) which is making me re-think of PG's > useful-ness. > > > > BTW, I also tried 8.2.4 CVS_STABLE Branch
Table "test_db.d" Column | Type | Modifiers -------------------------+-----------------------------+----------- id | character varying(12) | not null record_update_date_time | timestamp without time zone | code | character varying(5) | Indexes: "d_pkey" PRIMARY KEY, btree (id) "idx_d_upd" btree (record_update_date_time) Table "test_db.trh" Column | Type | Modifiers -------------------------+-----------------------------+----------- unique_id | character varying(35) | not null id | character varying(12) | hid | smallint | index_1 | character varying(3) | index_2 | character varying(3) | record_update_date_time | timestamp without time zone | start_timestamp | timestamp without time zone | phase_id | smallint | ttype | smallint | pber_2 | double precision | pber_3 | double precision | pber_4 | double precision | Indexes: "trh_pkey" PRIMARY KEY, btree (unique_id) "idx_trh_id" btree (id) Table "test_db.ts" Column | Type | Modifiers -------------------------+-----------------------------+----------- unique_id | character varying(50) | not null id | character varying(12) | record_update_date_time | timestamp without time zone | start_timestamp | timestamp without time zone | ttype | smallint | seq_time | timestamp without time zone | seq_date | timestamp without time zone | code | character varying(5) | Indexes: "ts_pkey" PRIMARY KEY, btree (unique_id) "idx_ts_id" btree (id) Table "test_db.trd" Column | Type | Modifiers -------------------------+-----------------------------+----------- unique_id | character varying(50) | not null id | character varying(12) | index_1 | character varying(3) | index_2 | character varying(3) | start_timestamp | timestamp without time zone | record_update_date_time | timestamp without time zone | phase_id | smallint | ttype | smallint | fval_2 | smallint | fval_3 | smallint | fval_4 | smallint | Indexes: "trd_pkey" PRIMARY KEY, btree (unique_id) "idx_td_id" btree (id) explain analyse SELECT JOIN1.ID, JOIN1.HID, JOIN1.START_TIMESTAMP, JOIN1.PHASE_ID, JOIN1.CODE, JOIN1.LOG_PBER_2, JOIN2.LOG_PBER_3, JOIN3.LOG_PBER_4 FROM ( SELECT TRH.ID, TRH.HID, TRH.START_TIMESTAMP, TRH.PHASE_ID, TS.SEQ_DATE, TS.SEQ_TIME, TRH.TTYPE, D.CODE, CASE WHEN PBER_2 = 0 THEN 0 ELSE LOG(TRH.PBER_2) END AS LOG_PBER_2 FROM TRH INNER JOIN D ON D.ID = TRH.ID INNER JOIN TS ON TS.ID = TRH.ID AND TS.TTYPE = TRH.TTYPE AND TS.START_TIMESTAMP = TRH.START_TIMESTAMP /* inner join trd on trd.id = trh.id and trd.ttype = trh.ttype and trd.start_timestamp = trh.start_timestamp */ WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' AND TRH.TTYPE IN (35,75,703,740,764) AND D.CODE IN ('HUA69') ) JOIN1 LEFT OUTER JOIN ( SELECT TRH.ID, TRH.HID, TRH.START_TIMESTAMP, TRH.PHASE_ID, TS.SEQ_DATE, TS.SEQ_TIME, TRH.TTYPE, D.CODE, CASE WHEN PBER_3 = 0 THEN 0 ELSE LOG(TRH.PBER_3) END AS LOG_PBER_3 FROM TRH INNER JOIN D ON D.ID = TRH.ID INNER JOIN TS ON TS.ID = TRH.ID AND TS.TTYPE = TRH.TTYPE AND TS.START_TIMESTAMP = TRH.START_TIMESTAMP /* inner join trd on trd.id = trh.id and trd.ttype = trh.ttype and trd.start_timestamp = trh.start_timestamp */ WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' AND TRH.TTYPE IN (69,178,198,704,757,741,765) AND D.CODE IN ('HUA69') ) JOIN2 ON JOIN2.ID = JOIN1.ID AND JOIN2.HID = JOIN1.HID AND JOIN2.SEQ_TIME = JOIN1.SEQ_TIME AND JOIN2.SEQ_DATE = JOIN1.SEQ_DATE LEFT OUTER JOIN (SELECT TRH.ID, TRH.HID, TRH.START_TIMESTAMP, TRH.PHASE_ID, TS.SEQ_DATE, TS.SEQ_TIME, TRH.TTYPE, D.CODE, CASE WHEN PBER_4 = 0 THEN 0 ELSE LOG(TRH.PBER_4) END AS LOG_PBER_4 FROM TRH INNER JOIN D ON D.ID = TRH.ID INNER JOIN TS ON TS.ID = TRH.ID AND TS.TTYPE = TRH.TTYPE AND TS.START_TIMESTAMP = TRH.START_TIMESTAMP /* inner join trd on trd.id = trh.id and trd.ttype = trh.ttype and trd.start_timestamp = trh.start_timestamp */ WHERE D.RECORD_UPDATE_DATE_TIME BETWEEN '8/20/2007' AND '9/8/2007' AND TRH.TTYPE IN (177,197,705,742,758,766) AND D.CODE IN ('HUA69') ) JOIN3 ON JOIN3.ID = JOIN1.ID AND JOIN3.HID = JOIN1.HID AND JOIN3.SEQ_TIME = JOIN1.SEQ_TIME AND JOIN3.SEQ_DATE = JOIN1.SEQ_DATE =========================2.5k results w/ 3 tables joined per subquery=============================== Hash Left Join (cost=4673.17..6517.48 rows=3 width=59) (actual time=204.802..288.454 rows=2552 loops=1) Hash Cond: (((trh.id)::text = (join2.id)::text) AND (trh.hid = join2.hid) AND (ts.seq_time = join2.seq_time) AND (ts.seq_date = join2.seq_date)) -> Hash Left Join (cost=2470.15..4314.39 rows=3 width=67) (actual time=111.784..183.319 rows=2552 loops=1) Hash Cond: (((trh.id)::text = (join3.id)::text) AND (trh.hid = join3.hid) AND (ts.seq_time = join3.seq_time) AND (ts.seq_date = join3.seq_date)) -> Hash Join (cost=299.32..2143.51 rows=3 width=59) (actual time=16.166..76.762 rows=2552 loops=1) Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp)) -> Seq Scan on trh (cost=0.00..1742.67 rows=9021 width=36) (actual time=0.489..40.565 rows=8888 loops=1) Filter: (ttype = ANY ('{35,75,703,740,764}'::integer[])) -> Hash (cost=284.78..284.78 rows=831 width=63) (actual time=15.620..15.620 rows=969 loops=1) -> Hash Join (cost=37.86..284.78 rows=831 width=63) (actual time=1.332..13.315 rows=969 loops=1) Hash Cond: ((ts.id)::text = (d.id)::text) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.117..6.091 rows=3244 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.195..1.195 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.479..0.854 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) -> Hash (cost=2170.77..2170.77 rows=3 width=42) (actual time=95.550..95.550 rows=2488 loops=1) -> Subquery Scan join3 (cost=299.32..2170.77 rows=3 width=42) (actual time=15.951..89.594 rows=2488 loops=1) -> Hash Join (cost=299.32..2170.74 rows=3 width=61) (actual time=15.946..83.598 rows=2488 loops=1) Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp)) -> Seq Scan on trh (cost=0.00..1775.26 rows=8543 width=36) (actual time=0.416..44.637 rows=8624 loops=1) Filter: (ttype = ANY ('{177,197,705,742,758,766}'::integer[])) -> Hash (cost=284.78..284.78 rows=831 width=63) (actual time=15.472..15.472 rows=969 loops=1) -> Hash Join (cost=37.86..284.78 rows=831 width=63) (actual time=1.311..13.160 rows=969 loops=1) Hash Cond: ((ts.id)::text = (d.id)::text) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.108..5.779 rows=3244 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.180..1.180 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.470..0.850 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) -> Hash (cost=2202.96..2202.96 rows=3 width=42) (actual time=92.933..92.933 rows=2704 loops=1) -> Subquery Scan join2 (cost=299.32..2202.96 rows=3 width=42) (actual time=16.333..86.181 rows=2704 loops=1) -> Hash Join (cost=299.32..2202.93 rows=3 width=61) (actual time=16.329..79.362 rows=2704 loops=1) Hash Cond: (((trh.id)::text = (d.id)::text) AND (trh.ttype = ts.ttype) AND (trh.start_timestamp = ts.start_timestamp)) -> Seq Scan on trh (cost=0.00..1807.85 rows=8508 width=36) (actual time=0.438..41.233 rows=8560 loops=1) Filter: (ttype = ANY ('{69,178,198,704,757,741,765}'::integer[])) -> Hash (cost=284.78..284.78 rows=831 width=63) (actual time=15.840..15.840 rows=969 loops=1) -> Hash Join (cost=37.86..284.78 rows=831 width=63) (actual time=1.482..13.557 rows=969 loops=1) Hash Cond: ((ts.id)::text = (d.id)::text) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.113..6.124 rows=3244 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.346..1.346 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.621..0.997 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) Total runtime: 291.763 ms =============================== ============ SAME QUERY as ABOVE, but 4 tables joined per subquery -Uncommented the trd table join ====================== Nested Loop Left Join (cost=762.61..2002.16 rows=1 width=59) (actual time=46.998..2120307.759 rows=2552 loops=1) Join Filter: (((join2.id)::text = (test_db.trh.id)::text) AND (join2.hid = test_db.trh.hid) AND (join2.seq_time = test_db.ts.seq_time) AND (join2.seq_date = test_db.ts.seq_date)) -> Nested Loop Left Join (cost=508.41..1334.71 rows=1 width=67) (actual time=31.007..1058040.141 rows=2552 loops=1) Join Filter: (((join3.id)::text = (test_db.trh.id)::text) AND (join3.hid = test_db.trh.hid) AND (join3.seq_time = test_db.ts.seq_time) AND (join3.seq_date = test_db.ts.seq_date)) -> Nested Loop (cost=254.20..667.31 rows=1 width=59) (actual time=15.402..221.441 rows=2552 loops=1) Join Filter: ((test_db.trd.ttype = test_db.trh.ttype) AND (test_db.trd.start_timestamp = test_db.trh.start_timestamp)) -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=15.139..32.858 rows=969 loops=1) Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype)) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.135..6.916 rows=3244 loops=1) -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=14.933..14.933 rows=1016 loops=1) -> Hash Join (cost=37.86..235.00 rows=835 width=47) (actual time=1.301..12.759 rows=1016 loops=1) Hash Cond: ((test_db.trd.id)::text = (test_db.d.id)::text) -> Seq Scan on trd (cost=0.00..176.57 rows=3257 width=24) (actual time=0.046..4.912 rows=3257 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.230..1.230 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.507..0.881 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) -> Bitmap Heap Scan on trh (cost=4.59..154.00 rows=43 width=36) (actual time=0.041..0.146 rows=16 loops=969) Recheck Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) Filter: (ttype = ANY ('{35,75,703,740,764}'::integer[])) -> Bitmap Index Scan on idx_trh_id (cost=0.00..4.59 rows=43 width=0) (actual time=0.030..0.030 rows=47 loops=969) Index Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) -> Nested Loop (cost=254.20..667.37 rows=1 width=61) (actual time=1.529..402.739 rows=2488 loops=2552) Join Filter: ((test_db.trd.ttype = test_db.trh.ttype) AND (test_db.trd.start_timestamp = test_db.trh.start_timestamp)) -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=0.349..50.926 rows=969 loops=2552) Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype)) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.336..23.042 rows=3244 loops=2552) -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=15.000..15.000 rows=1016 loops=1) -> Hash Join (cost=37.86..235.00 rows=835 width=47) (actual time=1.540..12.818 rows=1016 loops=1) Hash Cond: ((test_db.trd.id)::text = (test_db.d.id)::text) -> Seq Scan on trd (cost=0.00..176.57 rows=3257 width=24) (actual time=0.040..4.646 rows=3257 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.474..1.474 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.475..1.137 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) -> Bitmap Heap Scan on trh (cost=4.59..154.05 rows=43 width=36) (actual time=0.098..0.260 rows=16 loops=2472888) Recheck Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) Filter: (ttype = ANY ('{177,197,705,742,758,766}'::integer[])) -> Bitmap Index Scan on idx_trh_id (cost=0.00..4.59 rows=43 width=0) (actual time=0.058..0.058 rows=47 loops=2472888) Index Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) -> Nested Loop (cost=254.20..667.42 rows=1 width=61) (actual time=0.801..403.268 rows=2704 loops=2552) Join Filter: ((test_db.trd.ttype = test_db.trh.ttype) AND (test_db.trd.start_timestamp = test_db.trh.start_timestamp)) -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=0.526..46.203 rows=969 loops=2552) Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype)) -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.511..20.517 rows=3244 loops=2552) -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=15.262..15.262 rows=1016 loops=1) -> Hash Join (cost=37.86..235.00 rows=835 width=47) (actual time=1.268..12.653 rows=1016 loops=1) Hash Cond: ((test_db.trd.id)::text = (test_db.d.id)::text) -> Seq Scan on trd (cost=0.00..176.57 rows=3257 width=24) (actual time=0.039..4.664 rows=3257 loops=1) -> Hash (cost=35.56..35.56 rows=184 width=23) (actual time=1.205..1.205 rows=186 loops=1) -> Seq Scan on d (cost=0.00..35.56 rows=184 width=23) (actual time=0.477..0.875 rows=186 loops=1) Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA69'::text)) -> Bitmap Heap Scan on trh (cost=4.59..154.11 rows=43 width=36) (actual time=0.124..0.266 rows=16 loops=2472888) Recheck Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) Filter: (ttype = ANY ('{69,178,198,704,757,741,765}'::integer[])) -> Bitmap Index Scan on idx_trh_id (cost=0.00..4.59 rows=43 width=0) (actual time=0.064..0.064 rows=47 loops=2472888) Index Cond: ((test_db.d.id)::text = (test_db.trh.id)::text) Total runtime: 2120315.804 ms ==================== If I were to un-comment the join to table trd, the PG will choke and not come out with results for up to 20+mins (for the above 2.5K results sets retrieved) For 6K results, 2 hours and it's still crunching. re-writing to NOT use subselects and also joining 4 tables will also choke explain analyse select d.id, d.code, trh.hid, trh.start_timestamp, trh.phase_id, log(trh.pber_2) as log_pber_2, log(trh2.pber_3) as log_pber_3,log(trh3.pber_4) as log_pber_4,trd.fval_2, trd2.fval_3,trd3.fval_4 from d inner join trd on trd.id = d.id inner join ( trh inner join ts on (trh.id = ts.id and ts.ttype = trh.ttype and ts.start_timestamp = trh.start_timestamp)) on (d.id = trh.id and trh.id = trd.id and trd.ttype = trh.ttype and trd.start_timestamp = trh.start_timestamp and TRH.TTYPE IN (35,75,703,740,764)) left join ( trh trh2 inner join ts ts2 on (trh2.id = ts2.id and ts2.ttype = trh2.ttype and ts2.start_timestamp = trh2.start_timestamp) inner join trd trd2 on trd2.id = trh2.id and trd2.ttype = trh2.ttype and trd2.start_timestamp = trh2.start_timestamp ) on (d.id = trh2.id and ts2.seq_date = ts.seq_date and ts2.seq_time = ts.seq_time and trh2.hid=trh.hid and TRH2.TTYPE IN (69,178,198,704,757,741,765)) left join ( trh trh3 inner join ts ts3 on (trh3.id = ts3.id and ts3.ttype = trh3.ttype and ts3.start_timestamp = trh3.start_timestamp) inner join trd trd3 on trd3.id = trh3.id and trd3.ttype = trh3.ttype and trd3.start_timestamp = trh3.start_timestamp ) on (d.id = trh3.id and ts3.seq_date = ts.seq_date and ts3.seq_time = ts.seq_time and trh3.hid=trh.hid and TRH3.TTYPE IN (177,197,705,742,758,766)) where d.code = 'HUA69'
---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate