Re: directory tree query with big planner variation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Am 31.07.2006 um 15:30 schrieb Michael Stone:

If I understand the intend of this SQL,
Let me show the tables first:
                           Table "bacula.path"        (  65031 rows)
Column |  Type   |                       Modifiers
--------+--------- +-------------------------------------------------------
pathid | integer | not null default nextval('path_pathid_seq'::regclass)
path | text | not null ( complete pathnames of all directories )
Indexes:
    "path_pkey" PRIMARY KEY, btree (pathid)
    "path_name_idx" btree (path)

                             Table "bacula.file"       (3021903 rows)
   Column   |  Type   |                       Modifiers
------------+--------- +------------------------------------------------------- fileid | integer | not null default nextval ('file_fileid_seq'::regclass)
fileindex  | integer | not null default 0
jobid      | integer | not null
pathid     | integer | not null				(FK)
filenameid | integer | not null				(FK)
markid     | integer | not null default 0
lstat      | text    | not null
md5        | text    | not null
Indexes:
    "file_pkey" PRIMARY KEY, btree (fileid)
    "file_fp_idx" btree (filenameid, pathid)
    "file_jobid_idx" btree (jobid)
    "file_path_idx" btree (pathid)

                               Table "bacula.filename" ( 160559 rows)
   Column   |  Type   |                           Modifiers
------------+--------- +--------------------------------------------------------------- filenameid | integer | not null default nextval ('filename_filenameid_seq'::regclass)
name       | text    | not null
Indexes:
    "filename_pkey" PRIMARY KEY, btree (filenameid)
    "filename_name_idx" btree (name)

And now the query;

Task: Return the names of subdirectories and files immediately below a given path. For each none-empty subdirectory return children=true. The 1st part of the union selects all subdirecories (per regex) and the flatfiles contained in them plus one entry for the subdirectory itself (left outer joins). More than one joined filename means: "The subdirectory has children". The 2nd part of the union returns all flatfiles, contained in the given path. The surrounding SELECT removes the given path and the trailing "/" keeping only the subdirectory names from the pathnames, so they can be merged with the flatfile names.

you're pulling all the entries
in a directory in two parts. The first
(second)
part (files) is fairly straightforward. The second
(first)
part (directories) consists of pulling any file whose parent is a subdirectory of the directory you're looking for (this is *all* children of the directory, since you have to retrieve every element that begins with the directory, then discard those that have an additional / in their name), counting how many of these there are for each subdirectory, and discarding those results except for a binary (yes there are children or no there aren't). This is a lot of useless work to go through, and is going to be slow if you've got a lot of stuff in a subdirectory.
I agree, but did not yet find another way.
An alternative approach would be, for each directory, to store all its children (files and subdirectories) along with a flag indicating which it is. This would allow you to create the collapsed tree view without walking all the children of a subdirectory.
Perhaps in a temporary table?

Assuming you can't make changes to the schema, what about the query?
Can be changed.
You've got this:
Please reconsider your proposals with the above

It's hard to say without knowing what's actually *in* the tables, but the existing query definately doesn't scale well for what I think it's trying to do.

Mike Stone
Axel
Axel Rau, ☀Frankfurt , Germany                       +49-69-951418-0




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux