My apologies, After posting this and looking at how it appears I realised that line wrapping makes the tables totally illegible. Here they are again with all unnecessary columns removed and others shortened. locktype | database | relation | pid | mode | granted | waitstart | relation ----------+----------+------------+---------+---------------------+---------+-------------------------------+-------------------- relation | 529986 | 1842228045 | 2130531 | AccessShareLock | f | 2022-01-19 00:32:32.626152+01 | st.ctr_table_efr_oa relation | 529986 | 1842228045 | 2128603 | AccessExclusiveLock | t | | st.ctr_table_efr_oa relation | 529986 | 1842231489 | 2128603 | AccessExclusiveLock | f | 2022-01-19 00:32:32.924694+01 | st.tpd_oa relation | 529986 | 1842231489 | 2130531 | AccessShareLock | t | | st.tpd_oa pid | query_start | state_change | wait_event_type | wait_event | query ---------+--------------------+--------------------+-----------------+------------+------------------------------------------------------------- 2128603 | 00:32:32.924413+01 | 00:32:32.924413+01 | Lock | relation | DROP TABLE st.tpd_oa_18929 2130531 | 00:32:32.625706+01 | 00:32:32.625708+01 | Lock | relation | DELETE FROM st.ctr_table_efr_oa_19010 WHERE ropid = 44788868 I hope this version is easier to read. Cheers Mike On Tue, 25 Jan 2022 at 15:49, Michael Harris <harmic@xxxxxxxxx> wrote: > > Hello Experts > > I'm hoping you will be able to help me with a tricky issue. > > We've recently updated our application to PG 14.1, and in the test instance we > have started to see some alarming undetected deadlocks. > > An example of what we have seen is: > > locktype | database | relation | page | tuple | virtualxid | > transactionid | classid | objid | objsubid | virtualtransaction | > pid | mode | granted | fastpath | waitstart > | relation > ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+-----------------+---------+----------+-------------------------------+-------------------------------- > relation | 529986 | 1842228045 | | | | > | | | | 165/1941408 | 2130531 | > AccessShareLock | f | f | 2022-01-19 00:32:32.626152+01 | > st.ctr_table_efr_oa > (1 row) > > locktype | database | relation | page | tuple | virtualxid | > transactionid | classid | objid | objsubid | virtualtransaction | > pid | mode | granted | fastpath | waitstart | > relation > ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-----------+-------------------------------- > relation | 529986 | 1842228045 | | | | > | | | | 75/2193719 | 2128603 | > AccessExclusiveLock | t | f | | > st.ctr_table_efr_oa > (1 row) > > locktype | database | relation | page | tuple | virtualxid | > transactionid | classid | objid | objsubid | virtualtransaction | > pid | mode | granted | fastpath | waitstart > | relation > ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-------------------------------+----------- > relation | 529986 | 1842231489 | | | | > | | | | 75/2193719 | 2128603 | > AccessExclusiveLock | f | f | 2022-01-19 > 00:32:32.924694+01 | st.tpd_oa > (1 row) > > locktype | database | relation | page | tuple | virtualxid | > transactionid | classid | objid | objsubid | virtualtransaction | > pid | mode | granted | fastpath | > waitstart | relation > ---------------+----------+------------+------+-------+--------------+---------------+---------+-----------+----------+--------------------+---------+-----------------------+---------+----------+-------------------------------+----------- > relation | 529986 | 1842231489 | | | | > | | | | 165/1941408 | > 2130531 | AccessShareLock | t | f | > | st.tpd_oa > > So: > pid 2130531 waits for an AccessShareLock on relation 1842228045, > blocked by pid 2128603 which holds an AccessExclusiveLock > pid 2128603 waits for an AccessExclusiveLock on relation 1842231489, > blocked by pid 2130531 which holds an AccessShareLock > > The queries being executed by these backends are: > > pid | query_start | state_change > | wait_event_type | wait_event | state | > query > ---------+-------------------------------+-------------------------------+-----------------+------------+--------+------------------------------------------------------------------------- > 2128603 | 2022-01-19 00:32:32.924413+01 | 2022-01-19 > 00:32:32.924413+01 | Lock | relation | active | DROP > TABLE st.tpd_oa_18929 > 2130531 | 2022-01-19 00:32:32.625706+01 | 2022-01-19 > 00:32:32.625708+01 | Lock | relation | active | DELETE > FROM st.ctr_table_efr_oa_19010 WHERE ropid = 44788868 > (2 rows) > > Both of these processes had been waiting for nearly 2 hours when I > took the above printouts. My understanding of deadlock detection is > that when a backend is waiting for a lock, it runs a deadlock > detection function once a second so it should have been detected > straight away but somehow it is not. > > Are there any deadlock conditions that postgresql is not able to detect? > > Note that there were a lot of other processes also waiting on relation > 1842231489 - could that be confusing the deadlock detection routine? > > I am also confused about the locks which are being taken out by the > DELETE query. Process 2130531 is trying to delete rows from a > partition `st.ctr_table_efr_oa_19010`, and to do so it is trying to > get an AccessShareLock on the parent table `st.ctr_table_efr_oa`. I > don't really understand why that is, and in fact if I try to reproduce > the deadlock manually by executing the relevant SQL in psql, the > DELETE never tries to take out such a lock and no deadlock occurs. > > What are the circumstances where deleting rows directly from a > partition will cause it to take a lock on the partitioned table? > > Any suggestions for debugging this issue would be most welcome. > > Thanks in advance > > Cheers > Mike