Hi all,
if we were on Oracle i'd suggest /*+ rule */ to disable bad cost-based optimizer optimisation, but my knowlegde is very old regarding
Oracle.
Do optimizer hints in PG exists?
Obviously not:
https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion
My next guess would be: make several steps; not a single select but: create table … select … drop table;
In my experience "in" or "not in" are worst time killers. Try to avoid that.
Ist even worth a try to
create table ttemp … insert
'pg_catalog', 'information_schema'
Create index ... on ttemp …
WHERE nspname NOT IN (select .. from ttemp);
This may sound crazy but doing such experiments saved me a lot of execution time with Oracle and MySQL
best,
Anton
Von: liam saffioti <liam.saffiotti@xxxxxxxxx>
Gesendet: Mittwoch, 8. Dezember 2021 09:35
An: Dischner, Anton <Anton.Dischner@xxxxxxxxxxxxxxxxxxx>
Cc: pgsql-admin <pgsql-admin@xxxxxxxxxxxxxxxxxxxx>; Julien Rouhaud <rjuju123@xxxxxxxxx>
Betreff: Re: postgresql long running query
Hi again,
I created additional index for just one column. (CREATE INDEX ON "T_CMN_SLAHISTORY" ("DefinitionId");
The query now runs faster. But I don't think that's my problem.
My query takes a very long time sometimes, not always. I could not understand the reason for this.
I mean, why is it taking longer at that time for example?
My another example is here:
SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid)
> 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;
This query takes 800ms to run. But this query took 18 minutes last night. I don't understand why the query that normally always takes a short time is taking so long at this time. Also there are no locks.
2021-12-08 04:23:48.235 [local] surface postgres 508411 SELECT LOG: duration: 1124822.358 ms statement: SELECT nspname as schema_name, relname AS table_name, pg_total_relation_size (C .oid) AS table_size FROM pg_class
C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE nspname NOT IN ( 'pg_catalog', 'information_schema' ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size (C
.oid) DESC LIMIT 30
2021-12-08 04:23:48.227 [local] surface postgres 524884 SELECT LOG: duration: 525254.222 ms statement: SELECT nspname as schema_name, relname AS table_name, pg_total_relation_size (C .oid) AS table_size FROM pg_class
C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE nspname NOT IN ( 'pg_catalog', 'information_schema' ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size (C
.oid) DESC LIMIT 30
2021-12-08 04:23:48.235 [local] surface postgres 516992 SELECT LOG:
duration: 824718.012 ms statement: SELECT nspname as schema_name, relname AS table_name, pg_total_relation_size (C .oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE nspname NOT IN ( 'pg_catalog',
'information_schema' ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size (C .oid) DESC LIMIT 30
Finally, let me point out; the execution plan is here
EXPLAIN ANALYZE SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=34231.14..34231.22 rows=30 width=136) (actual time=837.358..837.367 rows=30 loops=1)
-> Sort (cost=34231.14..34233.24 rows=840 width=136) (actual time=837.356..837.362 rows=30 loops=1)
Sort Key: (pg_total_relation_size((c.oid)::regclass)) DESC
Sort Method: top-N heapsort Memory: 38kB
-> Hash Join (cost=204.31..34206.33 rows=840 width=136) (actual time=70.550..837.181 rows=161 loops=1)
Hash Cond: (c.relnamespace = n.oid)
-> Seq Scan on pg_class c (cost=0.00..33995.56 rows=1664 width=72) (actual time=0.246..801.685 rows=202 loops=1)
Filter: ((relkind <> 'i'::"char") AND (pg_total_relation_size((oid)::regclass) > 100000000))
Rows Removed by Filter: 9022
-> Hash (cost=166.91..166.91 rows=2992 width=68) (actual time=15.494..15.494 rows=3001 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 326kB
-> Seq Scan on pg_namespace n (cost=0.00..166.91 rows=2992 width=68) (actual time=0.026..14.616 rows=3001 loops=1)
Filter: ((nspname <> ALL ('{pg_catalog,information_schema}'::name[])) AND (nspname !~ '^pg_toast'::text))
Rows Removed by Filter: 2926
Planning Time: 2.627 ms
Execution Time: 837.660 ms
(16 rows)
Thank you so much for your help.
Hi all,
please report your findings trying to use a new index special for DefinitionId as suggested by Edward.
We recently found for our data:
By adding indexes the running times of queries were reduced from days to several seconds.
Don’t be scared: you can always remove indexes by dropping them!
best,
Anton
Von: liam saffioti <liam.saffiotti@xxxxxxxxx>
Gesendet: Freitag, 3. Dezember 2021 14:12
An: Dischner, Anton <Anton.Dischner@xxxxxxxxxxxxxxxxxxx>
Cc: pgsql-admin <pgsql-admin@xxxxxxxxxxxxxxxxxxxx>; Julien Rouhaud <rjuju123@xxxxxxxxx>
Betreff: Re: postgresql long running query
Hi Anton and Samed,
Thank you for your reply.
I use PostgreSQL 12.9 on RHEL 8.4.
There are no DDL queries on the database in a whole day, so there is no lock state.
The pgbadger slow queries report shows this query ran for 43min 11sec.
And table and indexes definition is:
\d+ "T_CMN_SLAHISTORY"
Table "T_CMN_SLAHISTORY"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
HistoryId | integer | | not null | generated by default as identity | plain | |
DefinitionId | integer | | not null | | plain | |
RuleName | character varying(200) | | not null | | extended | |
CreateUserId | integer | | not null | | plain | |
CreateUserTime | timestamp without time zone | | not null | | plain | |
LastRunningTime | timestamp without time zone | | | | plain | |
Cost | integer | | | | plain | |
UpdateUserId | integer | | | | plain | |
UpdateUserTime | timestamp without time zone | | | | plain | |
RecId | character varying(36) | | | | extended | |
MyTimeStamp | bytea | | | | extended | |
IsOldRecordBatch | boolean | | | | plain | |
InstanceId | bigint | | | | plain | |
Indexes:
"T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
"Index-20180207-152712" btree ("RuleName", "DefinitionId")
"Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
"Index-20180712-192836" btree ("InstanceId")
Access method: heap
Schema | Name | Type | Owner | Size | Description
---------+------------------+-------+----------+--------+-------------
public | T_CMN_SLAHISTORY | table | postgres | 244 MB |
Hi Liam,
top tuning tip for every relational database is: indexes!
I have adminstrated databases where data was 300 GB and created indexes 600 GB
You may think thats useless redundant but ist definitively not.
Rules for indexing:
Does it make sense to create an index for a table with one or several douzends of entries? -> YES!
Does it make sense to create an index that is optimized for certain queries? -> YES!
Does it make sense to create an index, make the query and drop the index? -> YES
Even with tables with billions of entries you should get answer times of less than a second!
… if you have a good index: date for example where the database engine plays ist full potental.
For example see:
https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes
Which indexes do you have?
best,
A
Von: liam saffioti <liam.saffiotti@xxxxxxxxx>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <pgsql-admin@xxxxxxxxxxxxxxxxxxxx>;
Julien Rouhaud <rjuju123@xxxxxxxxx>
Betreff: postgresql long running query
Hello Team,
I have a problem with a query that consumes a long time.
The query' execution plan is :
EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on "T_CMN_SLAHISTORY" x (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
Recheck Cond: ("DefinitionId" = 302)
Heap Blocks: exact=2175
-> Bitmap Index Scan on "Index-20180712-192739" (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
Index Cond: ("DefinitionId" = 302)
Planning Time: 11.132 ms
Execution Time: 120.320 ms
But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?
|