Hello 2010/7/20 Kneringer, Armin <Armin.Kneringer@xxxxxxxxxxxx>: > Hi there. > > I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed with 8.4.1 (and earlier versions) on CentOS 5.3 (x64) > > The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge deploying ca. 200 GB of data to the local disk (ca. 180.000 tmp-files) can you try show check explain with set enable_hashjoin to off; ? Regards Pavel Stehule > > explain SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > NOT EXISTS ( > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t4.objid AND > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = t4.objid AND > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Unique (cost=2592137103.99..2592137104.00 rows=1 width=8) > -> Sort (cost=2592137103.99..2592137104.00 rows=1 width=8) > Sort Key: t4.objid > -> Nested Loop (cost=1105592553.38..2592137103.98 rows=1 width=8) > -> Hash Anti Join (cost=1105592553.38..2592137095.75 rows=1 width=8) > Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid)) > -> Bitmap Heap Scan on fscsubfile t4 (cost=154.42...14136.40 rows=5486 width=8) > Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Bitmap Index Scan on ind_fscsubfile_filerespons (cost=0.00..153.05 rows=5486 width=0) > Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Hash (cost=11917516.57..11917516.57 rows=55006045159 width=16) > -> Nested Loop (cost=0.00..11917516.57 rows=55006045159 width=16) > -> Seq Scan on atdateval t5 (cost=0.00...294152.40 rows=1859934 width=12) > Filter: (attrid = 281479288456447::bigint) > -> Index Scan using ind_ataggval on ataggval q1_1 (cost=0.00..6.20 rows=4 width=12) > Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid)) > Filter: (q1_1.aggrid = 0) > -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) > (20 rows) > > > As the disks pace is limited on my test system I can't provide the "explain analyze" output > If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting. > (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be delivered without conversion > > explain SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t4.objid AND > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = q1_1.objid AND [!!!!!] > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Unique (cost=918320.29..971968.88 rows=1 width=8) > -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) > -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8) > Merge Cond: (t4.objid = q1_1.objid) > -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) > Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) > -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) > Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval)) > -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) > Sort Key: t5.objid, t5.aggrid > -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) > Recheck Cond: (attrid = 281479288456447::bigint) > -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) > Index Cond: (attrid = 281479288456447::bigint) > -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) > -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) > Sort Key: q1_1.objid, q1_1.aggval > -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) > Recheck Cond: (attrid = 281479288456451::bigint) > Filter: (aggrid = 0) > -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) > Index Cond: (attrid = 281479288456451::bigint) > -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) > (26 rows) > > explain analyze SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t4.objid AND > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = q1_1.objid AND [!!!!!] > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Unique (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.598..12083.048 rows=64 loops=1) > -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.594..12083.010 rows=64 loops=1) > -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8) (actual time=12037.524..12081.989 rows=108 loops=1) > Merge Cond: (t4.objid = q1_1.objid) > -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.073..83.498 rows=63436 loops=1) > Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8866.253..11753.055 rows=1299685 loops=1) > -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8866.246..11413.397 rows=1299685 loops=1) > Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval)) > -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3133.362..3774.076 rows=1299685 loops=1) > Sort Key: t5.objid, t5.aggrid > Sort Method: external merge Disk: 47192kB > -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=282.454..1079.038 rows=1857906 loops=1) > Recheck Cond: (attrid = 281479288456447::bigint) > -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) (actual time=258.749...258.749 rows=1857906 loops=1) > Index Cond: (attrid = 281479288456447::bigint) > -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685 loops=1) > -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685 loops=1) > Sort Key: q1_1.objid, q1_1.aggval > Sort Method: external merge Disk: 39920kB > -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1644.983..3634.044 rows=1857906 loops=1) > Recheck Cond: (attrid = 281479288456451::bigint) > Filter: (aggrid = 0) > -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1606.325..1606.325 rows=1877336 loops=1) > Index Cond: (attrid = 281479288456451::bigint) > -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=108) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) > Total runtime: 12108.663 ms > (29 rows) > > > Another way to optimize my query is to change it as follows: > (Once again changes are marked with [!!!!!] > > explain SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t5.objid AND [!!!!!] > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = t4.objid AND > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Unique (cost=916978.86..969139.72 rows=1 width=8) > -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8) > -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8) > Merge Cond: (t4.objid = t5.objid) > -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) > Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) > -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) > Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval)) > -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) > Sort Key: t5.objid, t5.aggrid > -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) > Recheck Cond: (attrid = 281479288456447::bigint) > -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) > Index Cond: (attrid = 281479288456447::bigint) > -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) > -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) > Sort Key: q1_1.objid, q1_1.aggval > -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) > Recheck Cond: (attrid = 281479288456451::bigint) > Filter: (aggrid = 0) > -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) > Index Cond: (attrid = 281479288456451::bigint) > -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) > (26 rows) > > > explain analyze SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t5.objid AND [!!!!!] > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = t4.objid AND > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Unique (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.964..12106.409 rows=64 loops=1) > -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.959..12106.375 rows=64 loops=1) > -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8) (actual time=12060.916..12105.374 rows=108 loops=1) > Merge Cond: (t4.objid = t5.objid) > -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.080..81.397 rows=63436 loops=1) > Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) > -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8874.492..11778.254 rows=1299685 loops=1) > -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8874.484..11437.175 rows=1299685 loops=1) > Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval)) > -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3117.555..3756.062 rows=1299685 loops=1) > Sort Key: t5.objid, t5.aggrid > Sort Method: external merge Disk: 39920kB > -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=289.475..1079.624 rows=1857906 loops=1) > Recheck Cond: (attrid = 281479288456447::bigint) > -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) (actual time=265.720...265.720 rows=1857906 loops=1) > Index Cond: (attrid = 281479288456447::bigint) > -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685 loops=1) > -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685 loops=1) > Sort Key: q1_1.objid, q1_1.aggval > Sort Method: external merge Disk: 39920kB > -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1646.955..3628.918 rows=1857906 loops=1) > Recheck Cond: (attrid = 281479288456451::bigint) > Filter: (aggrid = 0) > -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1608.233..1608.233 rows=1877336 loops=1) > Index Cond: (attrid = 281479288456451::bigint) > -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=108) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) > Total runtime: 12129.613 ms > (29 rows) > > > > As the query performs in roughly 12 seconds in both (changed) cases you might advise to change my queries :-) > (In fact we are working on this) > As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked this issue on MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the query an the performance is well. Oralce shows a slightly difference but the performance is also well. > As I mentioned we are looking forward to change our query but in my opinion there could be a general performance gain if this issue is addressed. (especially if you don't know you run into this issue on the query performance is sufficient enough) > > greets > Armin > > -- > Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance