It took some time to get stack traces because we didn't have root permission. Attaching stack traces of two processes (out of many) stuck on the same query below [1][2].
It seems the call is unable to break out of this loop:
https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgstat.c#L3361-L3400
The top frame of every stack trace points somewhere inside this loop, so the processes are not blocked elsewhere; they are repeatedly executing the loop without ever satisfying its exit condition.
[1]
[12:43:30][root][~]$ pstack 6283
#0 pgstat_read_current_status () at pgstat.c:3495
#1 0x0000000000732381 in pgstat_read_current_status () at pgstat.c:2566
#2 pgstat_fetch_stat_numbackends () at pgstat.c:2567
#3 0x000000000083bfef in pg_stat_get_activity (fcinfo=0x7ffd26955b80) at pgstatfuncs.c:581
#4 0x00000000006832a1 in ExecMakeTableFunctionResult (setexpr=0x14ea907dcc60, econtext=0x14ea907dca50, argContext=<optimized out>, expectedDesc=0x14ea907df048, randomAccess=0 '\000') at execSRF.c:231
#5 0x000000000068e7b3 in FunctionNext (node=node@entry=0x14ea907dc298) at nodeFunctionscan.c:94
#6 0x000000000068275a in ExecScanFetch (recheckMtd=0x68e4e0 <FunctionRecheck>, accessMtd=0x68e500 <FunctionNext>, node=0x14ea907dc298) at execScan.c:97
#7 ExecScan (node=0x14ea907dc298, accessMtd=0x68e500 <FunctionNext>, recheckMtd=0x68e4e0 <FunctionRecheck>) at execScan.c:147
#8 0x0000000000688009 in ExecProcNode (node=0x14ea907dc298) at ../../../src/include/executor/executor.h:250
#9 fetch_input_tuple (aggstate=aggstate@entry=0x14ea907dc4f8) at nodeAgg.c:695
#10 0x000000000068a0af in agg_retrieve_direct (aggstate=0x14ea907dc4f8) at nodeAgg.c:2347
#11 ExecAgg (pstate=0x14ea907dc4f8) at nodeAgg.c:2158
#12 0x000000000067cce2 in ExecProcNode (node=0x14ea907dc4f8) at ../../../src/include/executor/executor.h:250
#13 ExecutePlan (execute_once=<optimized out>, dest=0x14ea907a1190, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14ea907dc4f8, estate=0x14ea907dc038) at execMain.c:1723
#14 standard_ExecutorRun (queryDesc=0x14ea906cc038, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#15 0x000014ed56cee425 in pgss_ExecutorRun (queryDesc=0x14ea906cc038, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:891
#16 0x000014ed56cd760e in explain_ExecutorRun (queryDesc=0x14ea906cc038, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:267
#17 0x00000000007b328c in PortalRunSelect (portal=portal@entry=0x14ea907da038, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x14ea907a1190) at pquery.c:932
#18 0x00000000007b4630 in PortalRun (portal=portal@entry=0x14ea907da038, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1 '\001', dest=dest@entry=0x14ea907a1190, altdest=altdest@entry=0x14ea907a1190, completionTag=0x7ffd26956530 "") at pquery.c:773
#19 0x00000000007b0223 in exec_simple_query (query_string=0x14ed51d1f038 "select count(*) from pg_stat_activity;") at postgres.c:1145
#20 0x00000000007b2388 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x14ed51dd42a8, dbname=0x14ed51dd4158 "db_name", username=<optimized out>) at postgres.c:4235
#21 0x00000000004cf2ae in BackendRun (port=0x14ed51dfa380) at postmaster.c:4791
#22 BackendStartup (port=0x14ed51dfa380) at postmaster.c:4458
#23 ServerLoop () at postmaster.c:1930
#24 0x0000000000739d58 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x14ed51c246f0) at postmaster.c:1557
#25 0x00000000004d1594 in main (argc=9, argv=0x14ed51c246f0) at main.c:228
[2]
[14:53:36][root][~]$ pstack 82504
#0 0x000000000072e053 in pgstat_read_current_status () at pgstat.c:3467
#1 0x0000000000732381 in pgstat_read_current_status () at pgstat.c:2566
#2 pgstat_fetch_stat_numbackends () at pgstat.c:2567
#3 0x000000000083bfef in pg_stat_get_activity (fcinfo=0x7ffd26955c30) at pgstatfuncs.c:581
#4 0x00000000006832a1 in ExecMakeTableFunctionResult (setexpr=0x14ea906e95b0, econtext=0x14ea906e8a50, argContext=<optimized out>, expectedDesc=0x14ea906eb958, randomAccess=0 '\000') at execSRF.c:231
#5 0x000000000068e7b3 in FunctionNext (node=node@entry=0x14ea906e8298) at nodeFunctionscan.c:94
#6 0x00000000006826e7 in ExecScanFetch (recheckMtd=0x68e4e0 <FunctionRecheck>, accessMtd=0x68e500 <FunctionNext>, node=0x14ea906e8298) at execScan.c:97
#7 ExecScan (node=0x14ea906e8298, accessMtd=0x68e500 <FunctionNext>, recheckMtd=0x68e4e0 <FunctionRecheck>) at execScan.c:164
#8 0x0000000000688009 in ExecProcNode (node=0x14ea906e8298) at ../../../src/include/executor/executor.h:250
#9 fetch_input_tuple (aggstate=aggstate@entry=0x14ea906e84f8) at nodeAgg.c:695
#10 0x000000000068a0af in agg_retrieve_direct (aggstate=0x14ea906e84f8) at nodeAgg.c:2347
#11 ExecAgg (pstate=0x14ea906e84f8) at nodeAgg.c:2158
#12 0x000000000067cce2 in ExecProcNode (node=0x14ea906e84f8) at ../../../src/include/executor/executor.h:250
#13 ExecutePlan (execute_once=<optimized out>, dest=0x14ed51d1f448, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14ea906e84f8, estate=0x14ea906e8038) at execMain.c:1723
#14 standard_ExecutorRun (queryDesc=0x14ed51d854b8, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#15 0x000014ed56cee425 in pgss_ExecutorRun (queryDesc=0x14ed51d854b8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:891
#16 0x000014ed56cd760e in explain_ExecutorRun (queryDesc=0x14ed51d854b8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:267
#17 0x00000000007b328c in PortalRunSelect (portal=portal@entry=0x14ed51e08038, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x14ed51d1f448) at pquery.c:932
#18 0x00000000007b4630 in PortalRun (portal=portal@entry=0x14ed51e08038, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=<optimized out>, dest=dest@entry=0x14ed51d1f448, altdest=altdest@entry=0x14ed51d1f448, completionTag=0x7ffd26956730 "") at pquery.c:773
#19 0x00000000007b200c in exec_execute_message (max_rows=9223372036854775807, portal_name=0x14ed51d1f038 "") at postgres.c:2030
#20 PostgresMain (argc=<optimized out>, argv=argv@entry=0x14ed51d6d088, dbname=0x14ed51d6d068 "db_name", username=<optimized out>) at postgres.c:4298
#21 0x00000000004cf2ae in BackendRun (port=0x14ed51dfa380) at postmaster.c:4791
#22 BackendStartup (port=0x14ed51dfa380) at postmaster.c:4458
#23 ServerLoop () at postmaster.c:1930
#24 0x0000000000739d58 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x14ed51c246f0) at postmaster.c:1557
#25 0x00000000004d1594 in main (argc=9, argv=0x14ed51c246f0) at main.c:228
On Mon, May 6, 2019 at 2:33 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> neeraj kumar <neeru.cse@xxxxxxxxx> writes:
>> We are using PG 10.6. We have one cron job that queries pg_stat_activity
>> table to find out how many queries are running longer than X minutes and
>> generate metrics.
>> Query looks like this:
>> SELECT * FROM pg_stat_activity WHERE state='active'
>> After some days, this query gets stuck and doesn't finish. We tried to run
>> this query manually, with the same result.
>> We looked into the pg_locks table and this query is not blocked on any
>> lock:
>> https://justpaste.it/48rpe
>
> Interesting. Can you get a stack trace to show where in the code it's
> stuck?
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
> regards, tom lane
-------------------------------------
Thanks
Neeraj Kumar,
+1 (206) 427-7267