We recently saw a follower database taken down as a result of maxed-out connection slots.
The logs showed that the lock was held by PID 7 and that it was blocking another process's AccessShareLock request:
[9-1] sql_error_code = 00000 LOG: process 5148 still waiting for AccessShareLock on relation 2840 of database 16402 after 1000.103 ms
[9-2] sql_error_code = 00000 DETAIL: Process holding the lock: 7. Wait queue: ...
Since I did not catch the locking in progress, I have to rely on these logs to work out what occurred. The fact that a process was left waiting for an AccessShareLock leads me to believe that the lock actually held must have been an AccessExclusiveLock, as that is the only lock mode that conflicts with AccessShareLock.
Relation 2840 is a TOAST table whose primary table is `pg_statistic`.
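For reference, a catalog lookup along these lines should confirm that mapping (a sketch, run against the affected database, 16402 in the log above):

-- Resolve the relation OID from the log line and find the table it TOASTs for.
SELECT c.oid::regclass AS toast_table,
       t.oid::regclass AS owning_table
FROM pg_class c
JOIN pg_class t ON t.reltoastrelid = c.oid
WHERE c.oid = 2840;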
One other piece of potential evidence is that autovacuums were running on the primary around this time.
Outside of the lock logs there is no logging of what PID 7 was doing. Additionally, the other followers in the formation did not suffer from the same locking.
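If it happens again, I assume a query along these lines against pg_stat_activity would show the blocker in real time (a sketch; pg_blocking_pids() requires 9.6+ and the blocking backend has to still be active when we look):

-- List sessions that are currently blocked and the PIDs blocking them.
SELECT a.pid,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 80)       AS query
FROM pg_stat_activity a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;

But that only helps if we catch it live, which is why I am asking what can be reconstructed after the fact.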
Any ideas on what this might be or how we could further troubleshoot this issue?
--
Andy Cooper