Hello Everyone!
I did a bit of digging in the logs. Here are my observations:
Export start timestamp:
"2025-02-19 05:29:16.911 UTC [675181]: [2-1] db=postgres,user=cloudsqladmin LOG: connection authorized: user=cloudsqladmin database=postgres application_name=pg_dump SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256, bits=128)"
The pg_dump backend's process ID is 675181.
Locks observed:
Soon after the export started, lock-wait messages began appearing in the logs.
Waiting processes get cancelled:
In this case, all of the waiting PIDs (663355, 675442, 675407) were cancelled shortly afterwards.
On Wed, Feb 19, 2025 at 9:38 PM Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
On Wed, Feb 19, 2025 at 10:43 AM Ron Johnson <ronljohnsonjr@xxxxxxxxx> wrote:
On Wed, Feb 19, 2025 at 10:00 AM Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
No, that message is from a cancel request, like when you interrupt your
currently running query with Ctrl+C in "psql" or invoke pg_cancel_backend().
PostgreSQL doesn't do that by itself.

The Linux oom killer? I don't remember the exact error message that PG gives to the user, but ISTR that it's "user request". Had to search through /var/log/messages to see that oomkiller was the culprit.

OOM killer kills a process with sig 9. This reboots the entire cluster, and you would get some variant of "server closed the connection unexpectedly" or "terminating connection because of crash of another server process". So not a "user request".

Most likely some client (or client library) has an internal timer and cancels its own query after a certain amount of time. I know that JDBC's setQueryTimeout operates this way: it sets a client-side timeout which then kicks in to cancel the query by "user request".

The server load caused by an export could cause the other queries to run long enough for this logic to kick in, when they otherwise would not.

Cheers,
Jeff
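To make the client-side timeout mechanism Jeff describes concrete, here is a minimal stdlib-only Python sketch of the pattern (this is an illustration, not real JDBC or libpq code; the function name and return strings are made up for the example). The client arms a watchdog timer when it submits a query; if the query outlives the timer, the client itself issues the cancel, which is why the server logs it as a cancellation "due to user request":

```python
import threading
import time

def run_query_with_timeout(query_seconds: float, timeout_seconds: float) -> str:
    """Simulate a client driver that cancels its own query on timeout.

    In a real driver, the timer callback would send a cancel request to
    the server over a separate channel (comparable in effect to calling
    pg_cancel_backend() on the query's PID).
    """
    cancelled = threading.Event()

    # Arm the client-side watchdog; if it fires first, the "query" is cancelled.
    timer = threading.Timer(timeout_seconds, cancelled.set)
    timer.start()
    try:
        deadline = time.monotonic() + query_seconds
        while time.monotonic() < deadline:  # stand-in for a long-running query
            if cancelled.is_set():
                return "cancelled by client timeout (user request)"
            time.sleep(0.01)
        return "completed"
    finally:
        timer.cancel()  # disarm the watchdog if the query finished in time

print(run_query_with_timeout(0.05, 1.0))  # fast query beats the timeout
print(run_query_with_timeout(1.0, 0.05))  # slow query gets cancelled
```

Under export-induced load, queries that normally finish well inside such a timeout can slip past it, which matches the observation that the waiting PIDs were cancelled only while pg_dump was running.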