Search Postgresql Archives

Re: postgres slower on nested queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Richard Huxton schrieb:
Alexander Elgert wrote:
Hello,

I programmed a little script which iterates over all databases in a DBMS, iterating over all tables and then over all columns.
This skript works for mysql and postgres.

1. Solution overview
====================

foreach database {
   foreach table {
       foreach column {
           do something ...
       }
   }
}

OK - obviously not the way to do it if you really want all the columns in all the databases...

1. Solution detail
==================

foreach database
----------------
mysql: show databases
postgres: SELECT d.datname as Name FROM pg_catalog.pg_database d WHERE (d.datname != 'template0' AND d.datname != 'template1') ORDER BY 1;)

You can have template databases called other names too of course.

foreach table
-------------
mysql: show tables
postgres: select table_name from information_schema.tables where table_schema = 'public')

foreach column
--------------
mysql: show columns from '%s'
postgres: select column_name from information_schema.columns where table_name = '%s')

(If there are better queries for postgres, please let me know.)

2. Solution
===========

I found the postgres version VERY slow, so a decided to fetch

Define VERY - it took what, milliseconds to do this? Seconds? Hours?

I wrote two different functions to get the data out of the database, the one uses a huge amount of queries, the other uses an array to hold the data. Because in mysql I have to use "show tables/show columns from ..." I thought I can use the same style in postgres.

Of course, it works, but the unoptimized postgres code is 80 times slower. ;( The overhead for a single 'show columns from ...' seems to be small in mysql.
Please read on, I hope I am able to show you the difference.


select table_name, column_name from information_schema.columns where table_schema = 'public'

and wrote the output to an two dimensional array to process the elements later: $tableA[$row['table_name']][$row['column_name']]=True;

Not sure what this code is meant to show.
foreach database {
  foreach table_name, column_name {
      $tableA[$row['table_name']][$row['column_name']]=True;
  }
  // process 2-dim $tableA array
}

This results in a structure where I can itereate over all keys in the 2-dim array. You can see I iterate first over the databases and then over table AND columns!
--- mysql: ~1s (Database X)
--- postgres: ~1s (Database Y)
;)

In contrast: =======================================================

foreach database {
  foreach table {
      foreach column {
          do something ...
      }
  }
}
--- mysql: ~1s (Database X)
--- postgres: ~80s (Database Y)
;(


The second approach ist much faster, this must be because there is no nesting. ;(

What nesting? Are you trying to do sub-queries of some sort?
I did a loop over all tables and THEN calling a query for each table to get the columns (from the same table).
Yes, there are definitively more queries the DBMS has to manage.
(It is a bad style, but it is intuitive. Maybe the overhead of a single query is more time consuming than in mysql.)

In your first example, you were querying for individual rows, in the second you fetch them all at once. Of course it's quicker to run one query and return many rows than many queries each returning one row. Otherwise your RDBMS has something seriously wrong with it.

Further - I think - there is no real caching done in the PHP Library of xampp 1.5.4, php_pgsql.dll ;(

No (not that I'm sure what xampp is). Look into memcache/pg_memcache if you want caching - that does it properly.
Thanks.
(xampp is an environment to develop wep applications.)


It would be very helpful what causes the problem.

You haven't actually described a problem afaict. Can you describe what it is you're trying to do? Do you want:
  1. Details of a single column
  2. To loop through all the columns doing something
If you can say which of these you're trying to do, and what the "something" is, then I'm sure there's somebody here who can help.
I think it will be the best to provide example code.

Another question rises from this task:
======================================

Is there any possibility to use the shortcut \d as a query from PHP?
I used psql -E to display the query, but these queries rely on the
system catalogs, but the page:
http://www.postgresql.org/docs/8.2/interactive/information-schema.html
says they are not stable.
And - of course - it is much easier to type! ;)

Nope - \d is psql only. That's because what it does changes with the system catalogues. If you want a stable representation, use the information_schema. If you want all the PostgreSQL-specific details you'll have to cope with changes to the system catalogues. However, I seem to remember some utility views that pgadmin uses that might be of use.
It would be nice to have the \d-query stored somewhere in the database.

Oh - and you might find phppgadmin useful to look at.

Thanks.

Greetings,
   Alexander Elgert



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux