Re: ogr2ogr slow sql when checking system tables for column info and so on.

Hi,

On Fri, Oct 21, 2022 at 09:19:58AM +0000, Lars Aksel Opsahl wrote:
>
> The main problem is that ogr2ogr, for instance, spends more time getting system info about tables than doing the actual job.
>
> The time to pick up PostgreSQL meta info takes between 30 and 60 seconds, and sometimes hours if we have not done a vacuum analyze recently.
> The actual spatial jobs take less than 10 seconds.
>
> Before I run ogr2ogr I do a vacuum analyze:
>
>  schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum
> ------------+----------------+------------+------------+-------------------------------
>  pg_catalog | pg_class       |     215296 |       4365 | 2022-10-18 10:24:05.745915+02
>  pg_catalog | pg_attribute   |    1479648 |      18864 | 2022-10-18 12:36:52.820133+02
>  pg_catalog | pg_type        |     200777 |       2318 | 2022-10-18 06:33:58.598257+02
>  pg_catalog | pg_constraint  |      10199 |        104 | 2022-10-20 15:10:57.894674+02
>  pg_catalog | pg_namespace   |        860 |          1 | [NULL]
>  pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
> (6 rows)
>
> VACUUM ANALYZE pg_catalog.pg_class;
> VACUUM ANALYZE pg_catalog.pg_attribute;
> VACUUM ANALYZE pg_catalog.pg_namespace;
> VACUUM ANALYZE pg_catalog.pg_type;
> VACUUM ANALYZE pg_catalog.pg_constraint;
> VACUUM ANALYZE pg_catalog.pg_description;
>
> After running, we have these values:
>
> schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum
> ------------+----------------+------------+------------+-------------------------------
>  pg_catalog | pg_class       |     221739 |        464 | 2022-10-18 10:24:05.745915+02
>  pg_catalog | pg_namespace   |        860 |          2 |
>  pg_catalog | pg_attribute   |    1464900 |       1672 | 2022-10-18 12:36:52.820133+02
>  pg_catalog | pg_constraint  |      10200 |          8 | 2022-10-20 15:10:57.894674+02
>  pg_catalog | pg_type        |     204936 |         93 | 2022-10-18 06:33:58.598257+02
>  pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
> (6 rows)
>
> Here https://explain.depesz.com/s/oU19#stats is the SQL generated by ogr2ogr, which takes 33 seconds in this sample:
> [...]
> ->  Seq Scan on pg_attribute a  (rows=1464751) (actual time=0.028..17740.663
> [...]
> Then we take a copy of the pg_catalog tables involved.
>
> Here https://explain.depesz.com/s/NEwB#source is the trace when running the same SQL as ogr2ogr, but against the tables in test_pg_metadata; it runs in 5 seconds.
> [...]
> ->  Seq Scan on pg_attribute a  (rows=1452385) (actual time=0.006..156.392
>
> I do not understand why it's so much slower to use the tables in pg_catalog than the test_pg_metadata tables, because they have the same content.

In both cases you have a sequential scan over the pg_attribute table, but for
pg_catalog it takes 17 seconds to retrieve the 1.4M rows, while in the new table
it takes 156 ms.

It looks like your catalog is heavily bloated, which is the cause of the
slowdown.
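You can confirm this with the pgstattuple contrib extension, assuming you are allowed to install it; it reports how much of the physical relation is dead tuples and reusable free space (a heavily bloated table shows a large free/dead fraction relative to table_len):

```sql
-- pgstattuple is a contrib extension; requires superuser or pg_stat_scan_tables.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- How much of pg_attribute is live data vs. dead tuples and free space?
SELECT table_len,
       tuple_len,
       dead_tuple_len,
       free_space,
       round(100 * free_space::numeric / table_len, 1) AS free_pct
FROM pgstattuple('pg_catalog.pg_attribute');
```

If free_pct plus the dead-tuple fraction is large (say, well over half the table), the sequential scan is mostly reading empty pages, which matches the 17 s vs 156 ms difference you observed.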

You could do a VACUUM FULL of the tables in pg_catalog, but that would only be a
short-term fix, as your catalog will likely get bloated again.  Do you rely a
lot on temporary tables?  If so, they can easily lead to this kind of side
effect, and you should either modify your code to perform manual vacuums of the
catalog tables frequently, or add a dedicated high-frequency task running
similar vacuums to keep the bloat under control.
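As a sketch of such a dedicated task: if you can install the pg_cron extension (an assumption on my part; a plain OS cron job calling psql works just as well), something like the following would vacuum the most affected catalog tables every 15 minutes. The job name and schedule are just examples:

```sql
-- pg_cron must be in shared_preload_libraries and installed in this database.
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Vacuum the catalog tables that bloat fastest with heavy temp-table use.
-- VACUUM accepts a list of tables since PostgreSQL 12.
SELECT cron.schedule(
  'vacuum-catalog',          -- job name (example)
  '*/15 * * * *',            -- every 15 minutes (adjust to your workload)
  $$VACUUM (ANALYZE) pg_catalog.pg_class,
                     pg_catalog.pg_attribute,
                     pg_catalog.pg_type,
                     pg_catalog.pg_depend$$);
```

Plain VACUUM (unlike VACUUM FULL) does not take an exclusive lock, so running it this often is cheap once the bloat is under control.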
