directory tree query with big planner variation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi group,

this is a directory tree query for a backup system (http:// sourceforge.net/projects/bacula). You provide a path and get back the names of the children plus a boolean telling if the child has itself children.
The "%@" stands for the initial path:
---------------------------------------------------------------
explain analyze  SELECT X.name AS name, COUNT(CH) > 1 AS children
  FROM
    ( SELECT  RTRIM( REPLACE( NLPC.path, '%@/', ''),'/') AS name,
                                                 FN.name AS CH
        FROM
          ( SELECT P.path,P.pathid
              FROM bacula.path P
              WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
          LEFT OUTER JOIN bacula.file F
            ON
              NLPC.pathid = F.pathid
          LEFT OUTER JOIN bacula.filename FN
            ON
              F.filenameid = FN.filenameid
        GROUP BY NLPC.path, FN.name
      UNION
      SELECT  FN.name AS name, FN.name AS CH
        FROM
          bacula.path P, bacula.file F, bacula.filename FN
        WHERE
          P.path = '%@/'   AND
          P.pathid = F.pathid                           AND
          F.filenameid = FN.filenameid
    ) AS X
  WHERE X.name <> ''
  GROUP BY X.name
---------------------------------------------------------------
The 1st part of the union takes care of directories, the 2nd one of flat files. Application allows user navigation in a browser (clicking on a name in one column triggers the query and fills the next browser column).
Initial path of "/Users/axel/Library/Preferences" results in:
---------------------------------------------------------------
Sort (cost=1295.24..1295.47 rows=92 width=64) (actual time=818.987..819.871 rows=527 loops=1)
   Sort Key: upper(x.name)
-> GroupAggregate (cost=1288.56..1292.24 rows=92 width=64) (actual time=784.069..814.059 rows=527 loops=1) -> Unique (cost=1288.56..1289.25 rows=92 width=112) (actual time=783.931..809.708 rows=684 loops=1) -> Sort (cost=1288.56..1288.79 rows=92 width=112) (actual time=783.921..793.150 rows=5350 loops=1)
                     Sort Key: name, ch
-> Append (cost=642.03..1285.55 rows=92 width=112) (actual time=335.134..723.917 rows=5350 loops=1) -> Subquery Scan "*SELECT* 1" (cost=642.03..643.18 rows=46 width=112) (actual time=335.130..338.564 rows=184 loops=1) -> HashAggregate (cost=642.03..642.72 rows=46 width=112) (actual time=335.121..337.843 rows=184 loops=1) -> Nested Loop Left Join (cost=2.00..641.80 rows=46 width=112) (actual time=39.293..326.831 rows=1685 loops=1) -> Nested Loop Left Join (cost=0.00..502.63 rows=46 width=97) (actual time=21.026..202.206 rows=1685 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.02 rows=1 width=97) (actual time=15.480..56.935 rows=27 loops=1) Index Cond: ((path >= '/Users/axel/Library/Preferences/'::text) AND (path < '/ Users/axel/Library/Preferences0'::text)) Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/$'::text) AND (rtrim ("replace"(path, '/Users/axel/Library/Preferences/'::text, ''::text), '/'::text) <> ''::text)) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=0.473..5.119 rows=62 loops=27) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.01 rows=1 width=23) (actual time=0.058..0.061 rows=1 loops=1685) Recheck Cond: ("outer".filenameid = fn.filenameid) -> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1685) Index Cond: ("outer".filenameid = fn.filenameid) -> Nested Loop (cost=2.00..641.91 rows=46 width=19) (actual time=3.349..377.758 rows=5166 loops=1) -> Nested Loop (cost=0.00..502.62 rows=46 width=4) (actual time=3.118..97.375 rows=5200 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.01 rows=1 width=4) (actual time=0.045..0.052 rows=1 loops=1) Index Cond: (path = '/ Users/axel/Library/Preferences/'::text) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=3.058..76.014 rows=5200 loops=1) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.02 rows=1 width=23) (actual time=0.037..0.039 rows=1 loops=5200) Recheck Cond: ("outer".filenameid = fn.filenameid)
                                       Filter: (name <> ''::text)
-> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=5200) Index Cond: ("outer".filenameid = fn.filenameid)
Total runtime: 832.458 ms
---------------------------------------------------------------
which is ok, but initial path of "/Users/axel" results in (which is not ok):
---------------------------------------------------------------
Sort (cost=125533.67..125534.17 rows=200 width=64) (actual time=84273.963..84274.260 rows=181 loops=1)
   Sort Key: upper(x.name)
-> GroupAggregate (cost=123493.01..125526.03 rows=200 width=64) (actual time=84263.411..84272.427 rows=181 loops=1) -> Unique (cost=123493.01..124169.51 rows=90201 width=112) (actual time=84263.215..84270.129 rows=522 loops=1) -> Sort (cost=123493.01..123718.51 rows=90201 width=112) (actual time=84263.208..84265.632 rows=1432 loops=1)
                     Sort Key: name, ch
-> Append (cost=113172.83..116069.08 rows=90201 width=112) (actual time=83795.274..84251.660 rows=1432 loops=1) -> Subquery Scan "*SELECT* 1" (cost=113172.83..115426.71 rows=90155 width=112) (actual time=83795.270..83803.996 rows=410 loops=1) -> HashAggregate (cost=113172.83..114525.16 rows=90155 width=112) (actual time=83795.258..83802.369 rows=410 loops=1) -> Hash Left Join (cost=3124.38..112722.06 rows=90155 width=112) (actual time=56254.547..83779.903 rows=3648 loops=1) Hash Cond: ("outer".filenameid = "inner".filenameid) -> Merge Left Join (cost=0.00..106216.87 rows=90155 width=97) (actual time=54926.198..82430.621 rows=3648 loops=1) Merge Cond: ("outer".pathid = "inner".pathid) -> Index Scan using path_pkey on path p (cost=0.00..2567.57 rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1) Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim("replace"(path, '/ Users/axel/'::text, ''::text), '/'::text) <> ''::text)) -> Index Scan using file_path_idx on file f (cost=0.00..95191.99 rows=3020363 width=8) (actual time=17.561..74392.318 rows=2941790 loops=1) -> Hash (cost=2723.30..2723.30 rows=160430 width=23) (actual time=1299.103..1299.103 rows=160430 loops=1) -> Seq Scan on filename fn (cost=0.00..2723.30 rows=160430 width=23) (actual time=3.884..684.918 rows=160430 loops=1) -> Nested Loop (cost=2.00..641.91 rows=46 width=19) (actual time=93.252..442.196 rows=1022 loops=1) -> Nested Loop (cost=0.00..502.62 rows=46 width=4) (actual time=49.375..209.694 rows=1050 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.01 rows=1 width=4) (actual time=29.455..29.462 rows=1 loops=1) Index Cond: (path = '/ Users/axel/'::text) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=19.898..175.869 rows=1050 loops=1) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.02 rows=1 width=23) (actual time=0.206..0.208 rows=1 loops=1050) Recheck Cond: ("outer".filenameid = fn.filenameid)
                                       Filter: (name <> ''::text)
-> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.087..0.087 rows=1 loops=1050) Index Cond: ("outer".filenameid = fn.filenameid)
 Total runtime: 84295.927 ms
---------------------------------------------------------------
It happened once that the planner resolved the 2nd query with the 1st plan, but this is not reproducible.
How can I avoid the 2nd plan?

This is 8.1.4 on OpenBSD 3.9 with 2x1GHz PIII and 2GB.
Axel
Axel Rau, ☀Frankfurt , Germany                       +49-69-951418-0




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux