How to check if session is a hung thread/session

Edwin UY <edwin.uy@xxxxxxxxx> · Sun, 10 Nov 2024 10:25:42 +1300

Hi,

I am not sure how to explain this, but I believe this is something usually referred to in the JDBC world as a hung thread.

This is an Aurora RDS PostgreSQL database that was patched in what was supposed to be a zero-downtime operation.
After the patching, a user complained that a job he normally runs from OS cron has been failing.

When I check the database, it shows the following:

=> SELECT pg_postmaster_start_time(),
->        pg_postmaster_start_time() at time zone 'utc' at time zone 'Pacific/Auckland' as "Pacific/Auckland TIMEZONE",
->        date_trunc( 'second',  current_timestamp - pg_postmaster_start_time() ) as uptime;
-[ RECORD 1 ]-------------+------------------------------
pg_postmaster_start_time  | 2024-11-06 10:44:51.832663+00
Pacific/Auckland TIMEZONE | 2024-11-05 21:44:51.832663+00
uptime                    | 3 days 10:29:08
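
Note that the "Pacific/Auckland TIMEZONE" column above comes out 13 hours before the UTC value rather than after it. The first `at time zone 'utc'` turns the value into a timestamp without time zone, and the second `at time zone 'Pacific/Auckland'` then treats that value as Auckland local time and converts it back. A minimal sketch of a single conversion, assuming the intent was simply to display the start time as Auckland local time (the column alias is only illustrative):

```sql
-- Sketch only: one AT TIME ZONE on a timestamptz yields the local wall-clock time.
-- This is the same form used for backend_start_localtime in the later query.
SELECT pg_postmaster_start_time()                                 AS postmaster_start,
       pg_postmaster_start_time() AT TIME ZONE 'Pacific/Auckland' AS postmaster_start_localtime,
       date_trunc('second', current_timestamp - pg_postmaster_start_time()) AS uptime;
```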

So in practice it wasn't really zero downtime. I am not sure how Aurora RDS handles this, but from the output above the database was restarted, although pg_stat_activity shows the user's session is still there.

Checking pg_stat_activity

select pid as process_id,
       usename as username,
       datname as database_name,
       client_addr as client_address,
       application_name,
       now() - pg_stat_activity.query_start AS duration,
       backend_start,
       backend_start at time zone 'Pacific/Auckland' as backend_start_localtime,
       state,
       state_change,
       state_change at time zone 'Pacific/Auckland' as state_change_localtime
from pg_stat_activity
where usename = '[blah]'
order by username asc

I have the output as below:

 process_id | username  | database_name | client_address |    application_name    |    duration     |         backend_start         |  backend_start_localtime   | state |         state_change          |   state_change_localtime
------------+-----------+---------------+----------------+------------------------+-----------------+-------------------------------+----------------------------+-------+-------------------------------+----------------------------
      31947 | [blah]    | [blah]        | [blah]         | PostgreSQL JDBC Driver | 00:47:21.838892 | 2024-11-06 10:44:53.309388+00 | 2024-11-06 23:44:53.309388 | idle  | 2024-11-08 02:00:06.005173+00 | 2024-11-08 15:00:06.005173
(1 row)
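
In this output, backend_start (10:44:53 UTC) is about 1.5 seconds after pg_postmaster_start_time (10:44:51 UTC), which suggests the session connected right after the restart rather than surviving it. A sketch that puts the two values side by side, using the same redacted user filter as above; the column aliases are only illustrative:

```sql
-- Sketch only: compare each backend's start time with the postmaster start time.
-- A small positive interval means the session connected shortly after the restart.
SELECT pid,
       backend_start,
       pg_postmaster_start_time()                 AS postmaster_start,
       backend_start - pg_postmaster_start_time() AS connected_after_restart
FROM pg_stat_activity
WHERE usename = '[blah]'
ORDER BY backend_start;
```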

From the pg_stat_activity output, backend_start has not changed in the days since the patching, so I am guessing the session may already be lost or rogue. Is there any way to confirm this before I kill it?
The state_change timestamp is still being updated, though; I don't know whether that proves it is not a hung thread.
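
One way to look closer before killing anything is sketched below, assuming the pid 31947 from the output above. If state_change and query_start keep advancing between runs, the client is still sending statements on that connection, so it is probably an idle pooled connection rather than a hung one. The column aliases are only illustrative, and on Aurora RDS terminating another role's session may require rds_superuser or similar privileges; that part is an assumption about the setup.

```sql
-- Sketch only: run this a few times and see whether query_start / state_change advance.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       xact_start,
       query_start,
       state_change,
       now() - state_change AS time_in_current_state,
       left(query, 60)      AS last_query
FROM pg_stat_activity
WHERE pid = 31947;

-- Only if the session really does turn out to be stuck:
-- SELECT pg_terminate_backend(31947);
```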

Any advice much appreciated. Thanks in advance.

Regards,
Ed