ogr2ogr slow sql when checking system tables for column info and so on.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hei

The main problem is that for instance ogr2ogr is using more time to get system info about tables than doing the actual job.

The time pick up postgresql meta info takes between 30 and 60 seconds and sometimes hours if we have not done vacuum analyze recenlty. 
Then actual spatial jobs takes less than 10 seconds.

Before I run ogr2ogr I do vacuum analyze

 schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum        
------------+----------------+------------+------------+-------------------------------
 pg_catalog | pg_class       |     215296 |       4365 | 2022-10-18 10:24:05.745915+02
 pg_catalog | pg_attribute   |    1479648 |      18864 | 2022-10-18 12:36:52.820133+02
 pg_catalog | pg_type        |     200777 |       2318 | 2022-10-18 06:33:58.598257+02
 pg_catalog | pg_constraint  |      10199 |        104 | 2022-10-20 15:10:57.894674+02
 pg_catalog | pg_namespace   |        860 |          1 | [NULL]
 pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
(6 rows)

VACUUM ANALYZE pg_catalog.pg_class;
VACUUM ANALYZE pg_catalog.pg_attribute;
VACUUM ANALYZE pg_catalog.pg_namespace;
VACUUM ANALYZE pg_catalog.pg_type;
VACUUM ANALYZE pg_catalog.pg_constraint;
VACUUM ANALYZE pg_catalog.pg_description;

After running, we have this values

schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum        
------------+----------------+------------+------------+-------------------------------
 pg_catalog | pg_class       |     221739 |        464 | 2022-10-18 10:24:05.745915+02
 pg_catalog | pg_namespace   |        860 |          2 |
 pg_catalog | pg_attribute   |    1464900 |       1672 | 2022-10-18 12:36:52.820133+02
 pg_catalog | pg_constraint  |      10200 |          8 | 2022-10-20 15:10:57.894674+02
 pg_catalog | pg_type        |     204936 |         93 | 2022-10-18 06:33:58.598257+02
 pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
(6 rows)

Here https://explain.depesz.com/s/oU19#stats the sql generated by ogr2ogr that takes 33 seconds in this sample

Then we take copy of the pg_catalog tables involved.

CREATE SCHEMA test_pg_metadata;                                  

CREATE TABLE  test_pg_metadata.pg_class ( like pg_class including all);
INSERT INTO test_pg_metadata.pg_class SELECT * FROM pg_class;

-- CREATE TABLE  test_pg_metadata.pg_attribute ( like pg_attribute including all);
-- Failes with ERROR:  42P16: column "attmissingval" has pseudo-type anyarray
-- has to do it this way
CREATE TABLE test_pg_metadata.pg_attribute AS SELECT
attcollation,attrelid,attname,atttypid,attstattarget,attlen,attnum,attndims,attcacheoff,atttypmod,attbyval,attstorage,attalign,attnotnull,atthasdef,atthasmissing,attidentity,attgenerated,attisdropped,attislocal,attinhcount,attacl,attoptions,attfdwoptions
FROM pg_attribute;
CREATE UNIQUE INDEX ON test_pg_metadata.pg_attribute(attrelid, attnum);
CREATE UNIQUE INDEX ON test_pg_metadata.pg_attribute(attrelid, attname);

CREATE TABLE  test_pg_metadata.pg_namespace ( like pg_namespace including all);
INSERT INTO test_pg_metadata.pg_namespace SELECT * FROM pg_namespace;

CREATE TABLE  test_pg_metadata.pg_type ( like pg_type including all);
INSERT INTO test_pg_metadata.pg_type SELECT * FROM pg_type;

CREATE TABLE  test_pg_metadata.pg_constraint ( like pg_constraint including all);
INSERT INTO test_pg_metadata.pg_constraint SELECT * FROM pg_constraint;
   
CREATE TABLE  test_pg_metadata.pg_description ( like pg_description including all);
INSERT INTO test_pg_metadata.pg_description SELECT * FROM pg_description;

There is no primary key on pg_attribute but that does not make any difference when testing, it seems like.

Here https://explain.depesz.com/s/NEwB#source is the trace when using the same sql as from ogr2ogr but using the tables in test_pg_metadata and then it runs in 5 seconds.

I do not understand way it's so much slower to use the tables in  pg_catalog than in test_pg_metadata tables because they have the same content.

I also tried to create a new index on pg_catalog.pg_attribute to check if that could help but that was not allowed, I was running as the postgres user.

CREATE INDEX ON pg_catalog.pg_attribute(atttypid);
ERROR:  42501: permission denied: "pg_attribute" is a system catalog
LOCATION:  RangeVarCallbackOwnsRelation, tablecmds.c:16486

We run on
PostgreSQL 12.6 (Ubuntu 12.6-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
POSTGIS="3.1.1 aaf4c79" [EXTENSION] PGSQL="120" GEOS="3.9.0-CAPI-1.16.2" SFCGAL="1.3.7" PROJ="7.2.1" GDAL="GDAL 3.2.1, released 2020/12/29" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER

Thanks

Lars


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux