Hi,
Since this morning our system is running slower than usual. It turns out that some queries take a very long time to plan ( > 1 second). The problem occurs when joining bigger tables. There are no partition for the used tables. The problem has a time correlation with the last autovacuum/autoanalyse this morning, but manual vacuuming and analysing did not fix the problem.
An example explain is:
EXPLAIN ANALYSE
SELECT 1
FROM table_a a
LEFT JOIN table_b bON b.a_id= a.id
WHERE a.object_id=13
AND a.timestamp<'2019-12-06'
AND a.timestamp>'2019-12-03'
SELECT 1
FROM table_a a
LEFT JOIN table_b bON b.a_id= a.id
WHERE a.object_id=13
AND a.timestamp<'2019-12-06'
AND a.timestamp>'2019-12-03'
Nested Loop Left Join (cost=1.28..18137.57 rows=6913 width=4) (actual time=0.043..90.016 rows=14850 loops=1)
-> Index Scan using uq_object_id_timestamp on table_a a (cost=0.70..7038.49 rows=6913 width=8) (actual time=0.028..21.832 rows=14850 loops=1)
Index Cond: ((object_id = 13) AND (timestamp < '2019-12-06'::timestamp with time zone) AND (timestamp > '2019-12-03'::timestamp with time zone))
-> Index Only Scan using table_b_a_id on table_b b (cost=0.57..1.60 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=14850)
Index Cond: (a_id = a.id)
Heap Fetches: 0
Planning time: 1908.550 ms
Execution time: 91.004 ms
-> Index Scan using uq_object_id_timestamp on table_a a (cost=0.70..7038.49 rows=6913 width=8) (actual time=0.028..21.832 rows=14850 loops=1)
Index Cond: ((object_id = 13) AND (timestamp < '2019-12-06'::timestamp with time zone) AND (timestamp > '2019-12-03'::timestamp with time zone))
-> Index Only Scan using table_b_a_id on table_b b (cost=0.57..1.60 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=14850)
Index Cond: (a_id = a.id)
Heap Fetches: 0
Planning time: 1908.550 ms
Execution time: 91.004 ms
The same query on a similar parallel system takes 5ms for planing (PG 9.4.).
Is there a way to detect why the planing is taking this long?
The database is a 9.6.1 with 32GB of shared_buffers, and 1GB of maintanance_work_mem, and machine CPU is below 80% all the time.
Regards,
Mladen Marinović