potential performance gain by query planner optimization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi there.

I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed with 8.4.1 (and earlier versions) on CentOS 5.3 (x64)

The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge deploying ca. 200 GB of data to the local disk (ca. 180.000 tmp-files)

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
 NOT EXISTS (
 WHERE t6.objid = t4.objid AND
 t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t4.objid AND
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = t4.objid AND
   t5.attrid = 281479288456447 ) AND
 ((t6.objclassid IN (285774255832590,285774255764301))) AND
 ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;

                                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=2592137103.99..2592137104.00 rows=1 width=8)
   ->  Sort  (cost=2592137103.99..2592137104.00 rows=1 width=8)
         Sort Key: t4.objid
         ->  Nested Loop  (cost=1105592553.38..2592137103.98 rows=1 width=8)
               ->  Hash Anti Join  (cost=1105592553.38..2592137095.75 rows=1 width=8)
                     Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
                     ->  Bitmap Heap Scan on fscsubfile t4  (cost=154.42...14136.40 rows=5486 width=8)
                           Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
                           ->  Bitmap Index Scan on ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
                                 Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
                     ->  Hash  (cost=11917516.57..11917516.57 rows=55006045159 width=16)
                           ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16)
                                 ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12)
                                       Filter: (attrid = 281479288456447::bigint)
                                 ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
                                       Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid))
                                       Filter: (q1_1.aggrid = 0)
               ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
                     Index Cond: (t6.objid = t4.objid)
                     Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(20 rows)


As the disks pace is limited on my test system I can't provide the "explain analyze" output
If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting.
(Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be delivered without conversion

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t4.objid AND
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = q1_1.objid AND                 [!!!!!]
   t5.attrid = 281479288456447 ) AND
   ((t6.objclassid IN (285774255832590,285774255764301))) AND
   ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;
                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=918320.29..971968.88 rows=1 width=8)
   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8)
         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8)
               Merge Cond: (t4.objid = q1_1.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
                                 Sort Key: t5.objid, t5.aggrid
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(26 rows)

explain analyze SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
  NOT EXISTS (
  SELECT 1
  FROM ataggval q1_1,
  atdateval t5
  WHERE q1_1.objid = t4.objid AND
  q1_1.attrid = 281479288456451 AND
  q1_1.aggrid = 0 AND
  t5.aggrid = q1_1.aggval AND
  t5.objid = q1_1.objid AND                 [!!!!!]
  t5.attrid = 281479288456447 ) AND
((t6.objclassid IN (285774255832590,285774255764301))) AND
((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
ORDER BY t4.objid;
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.598..12083.048 rows=64 loops=1)
   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.594..12083.010 rows=64 loops=1)
         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8) (actual time=12037.524..12081.989 rows=108 loops=1)
               Merge Cond: (t4.objid = q1_1.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.073..83.498 rows=63436 loops=1)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8866.253..11753.055 rows=1299685 loops=1)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8866.246..11413.397 rows=1299685 loops=1)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3133.362..3774.076 rows=1299685 loops=1)
                                 Sort Key: t5.objid, t5.aggrid
                                 Sort Method:  external merge  Disk: 47192kB
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=282.454..1079.038 rows=1857906 loops=1)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual time=258.749...258.749 rows=1857906 loops=1)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685 loops=1)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685 loops=1)
                                      Sort Key: q1_1.objid, q1_1.aggval
                                       Sort Method:  external merge  Disk: 39920kB
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1644.983..3634.044 rows=1857906 loops=1)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1606.325..1606.325 rows=1877336 loops=1)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=108)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
Total runtime: 12108.663 ms
(29 rows)


Another way to optimize my query is to change it as follows:
(Once again changes are marked with [!!!!!]

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t5.objid AND                 [!!!!!]
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = t4.objid AND
   t5.attrid = 281479288456447 ) AND
 ((t6.objclassid IN (285774255832590,285774255764301))) AND
 ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;
                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=916978.86..969139.72 rows=1 width=8)
   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8)
         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8)
               Merge Cond: (t4.objid = t5.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
                                 Sort Key: t5.objid, t5.aggrid
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(26 rows)


explain analyze SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
  NOT EXISTS (
  SELECT 1
  FROM ataggval q1_1,
  atdateval t5
  WHERE q1_1.objid = t5.objid AND                 [!!!!!]
  q1_1.attrid = 281479288456451 AND
  q1_1.aggrid = 0 AND
  t5.aggrid = q1_1.aggval AND
  t5.objid = t4.objid AND
  t5.attrid = 281479288456447 ) AND
((t6.objclassid IN (285774255832590,285774255764301))) AND
((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
ORDER BY t4.objid;
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Unique  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.964..12106.409 rows=64 loops=1)
   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.959..12106.375 rows=64 loops=1)
         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8) (actual time=12060.916..12105.374 rows=108 loops=1)
               Merge Cond: (t4.objid = t5.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.080..81.397 rows=63436 loops=1)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8874.492..11778.254 rows=1299685 loops=1)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8874.484..11437.175 rows=1299685 loops=1)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3117.555..3756.062 rows=1299685 loops=1)
                                 Sort Key: t5.objid, t5.aggrid
                                 Sort Method:  external merge  Disk: 39920kB
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=289.475..1079.624 rows=1857906 loops=1)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual time=265.720...265.720 rows=1857906 loops=1)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685 loops=1)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685 loops=1)
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       Sort Method:  external merge  Disk: 39920kB
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1646.955..3628.918 rows=1857906 loops=1)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1608.233..1608.233 rows=1877336 loops=1)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=108)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
Total runtime: 12129.613 ms
(29 rows)



As the query performs in roughly 12 seconds in both (changed) cases you might advise to change my queries :-)
(In fact we are working on this)
As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked this issue on MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the query an the performance is well. Oralce shows a slightly difference but the performance is also well.
As I mentioned we are looking forward to change our query but in my opinion there could be a general performance gain if this issue is addressed. (especially if you don't know you run into this issue on the query performance is sufficient enough)

greets
Armin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux