On 9/23/19 1:58 PM, Corey Taylor wrote:
On Mon, Sep 23, 2019 at 8:57 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Maybe check for waiting on a lock?
It'd be useful to look in pg_stat_activity and/or top(1) while the
initial query is running, to see if it seems to be eating CPU or
is blocked on some condition.
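For reference, a minimal check along those lines from a second session (assuming PostgreSQL 9.6 or later, where pg_stat_activity has the wait_event columns and pg_blocking_pids() exists) might look like:

    -- Show active backends, what they are currently waiting on (if anything),
    -- and which pids hold the locks they are blocked behind.
    SELECT pid,
           wait_event_type,
           wait_event,
           pg_blocking_pids(pid) AS blocked_by,
           state,
           query
    FROM pg_stat_activity
    WHERE state <> 'idle';

If wait_event_type comes back as 'Lock', the blocked_by array names the backend(s) holding the conflicting lock; if it stays empty while %CPU is high, the backend is simply running.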
I think this will provide information that makes it seem less weird.
With your suggestion, I monitored postgres via top and pg_stat_activity
in the various scenarios and found that an autovacuum run was being
triggered on tables used in the INSERT INTO ... SELECT. What seems to be
happening is that the autovacuum takes just about as long as the time I
give the query to run before giving up on it. The next time I run the
query, the autovacuum is complete and the query runs normally.
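One way to cross-check that timing (a sketch using the pg_stat_user_tables view; the wss schema name is taken from the output below) is to compare the query window against the autovacuum/autoanalyze timestamps on the source tables:

    -- When autovacuum/autoanalyze last touched the wss tables,
    -- and how much churn has built up since the last analyze.
    SELECT schemaname,
           relname,
           last_autovacuum,
           last_autoanalyze,
           n_dead_tup,
           n_mod_since_analyze
    FROM pg_stat_user_tables
    WHERE schemaname = 'wss'
    ORDER BY relname;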
Hmm, are there triggers on wss_entries that are UPDATEing/DELETEing
entries elsewhere?
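If in doubt, a quick way to list any user-level triggers on that table (wss_entries as named above; schema-qualify it if needed) is:

    -- User triggers defined on wss_entries, with their full definitions.
    SELECT tgname,
           pg_get_triggerdef(oid) AS definition
    FROM pg_trigger
    WHERE tgrelid = 'wss_entries'::regclass
      AND NOT tgisinternal;

In psql, \d wss_entries shows the same information at the bottom of the table description.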
Of course, I'd like to understand why the query never finishes.
when autovacuum is running:
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   38     1 postgres R     158m   8%   1  50% postgres: revwaste portal-local 172.19.0.5(39956) SELECT
   36     1 postgres S     171m   9%   0   0% postgres: autovacuum worker process portal-local
   34     1 postgres S     186m   9%   0   0% postgres: autovacuum worker process portal-local
after autovacuum finishes and during remaining soft-lock:
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   45     1 postgres R     259m  13%   0  50% postgres: revwaste portal-local 172.19.0.5(39962) SELECT
   20     1 postgres S     153m   8%   1   0% postgres: writer process
   22     1 postgres S     153m   8%   0   0% postgres: autovacuum launcher process
Same for pg_stat_activity:
         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type | wait_event | state  | backend_xid | backend_xmin |               query
-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+------------+--------+-------------+--------------+-----------------------------------
 2019-09-23 20:29:45.127527+00 | 2019-09-23 20:29:45.12886+00  | 2019-09-23 20:29:45.12886+00  | 2019-09-23 20:29:45.128861+00 |                 |            | active |         808 |          808 | select import_wss()
 2019-09-23 20:30:01.624853+00 | 2019-09-23 20:30:58.047317+00 | 2019-09-23 20:30:58.047317+00 | 2019-09-23 20:30:58.047318+00 |                 |            | active |             |          808 | autovacuum: ANALYZE wss.RowCoding
 2019-09-23 20:31:01.644824+00 | 2019-09-23 20:31:01.666711+00 | 2019-09-23 20:31:01.666711+00 | 2019-09-23 20:31:01.666712+00 |                 |            | active |             |          808 | autovacuum: ANALYZE wss.WSSData
 2019-09-23 20:31:14.101808+00 | 2019-09-23 20:31:14.103306+00 | 2019-09-23 20:31:14.103306+00 | 2019-09-23 20:31:14.103307+00 |                 |            | active |             |          808 | select * from pg_stat_activity
(4 rows)
         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type | wait_event | state  | backend_xid | backend_xmin |               query
-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+------------+--------+-------------+--------------+---------------------------------
 2019-09-23 20:29:45.127527+00 | 2019-09-23 20:29:45.12886+00  | 2019-09-23 20:29:45.12886+00  | 2019-09-23 20:29:45.128861+00 |                 |            | active |         808 |          808 | select import_wss()
 2019-09-23 20:34:21.01283+00  | 2019-09-23 20:34:21.014473+00 | 2019-09-23 20:34:21.014473+00 | 2019-09-23 20:34:21.014475+00 |                 |            | active |             |          808 | select * from pg_stat_activity
(2 rows)
Can you create a self-contained test case that acts like this?
I can try to duplicate it if this doesn't provide the details needed.
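A purely illustrative skeleton of such a test case (all names here are placeholders, not the real wss schema): bulk-load a source table so that autoanalyze is triggered, then immediately run an INSERT INTO ... SELECT over it:

    -- Hypothetical reproduction skeleton, not the actual import_wss() schema.
    CREATE SCHEMA IF NOT EXISTS wss_test;

    CREATE TABLE wss_test.source_data (id serial PRIMARY KEY, payload text);
    CREATE TABLE wss_test.target_data (id integer, payload text);

    -- Enough new rows to exceed autovacuum_analyze_threshold/scale_factor.
    INSERT INTO wss_test.source_data (payload)
    SELECT md5(g::text) FROM generate_series(1, 1000000) AS g;

    -- Run right after the load, while autoanalyze is (likely) still working:
    INSERT INTO wss_test.target_data
    SELECT id, payload FROM wss_test.source_data;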
corey
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx