On 19 April 2018 at 14:40, Wells Oliver <wells.oliver@xxxxxxxxx> wrote:
Had an issue tonight where I had a bunch of stalled queries from a client connection and I just... could... not... kill... them. We disconnected the client machine, turned it off, picked it up, shook it around, yelled at it, and still these idle queries remained in pg_stat_activity.

Then I did select pg_cancel_backend(pid) from pg_stat_activity where client_addr = '..' and they just would... not... go.. away.

So me being the big smart system administrator guy with shell access, I logged in, and did a kill -9 xxx where xxx was the same pid from the pg_stat_activity result and... they finally went away!

Felt good about myself until I realized, well, so did every other connection, and in fact PG momentarily went into recovery mode.

Everything was fine, but a) why is it a bad idea to kill -9 a client PG process, but pg_cancel_backend() is OK, and b) what to do about stalled PG queries that won't die when you disconnect AND when you pg_cancel_backend() them?
Not sure about Postgres-specific reasons not to kill -9, but from a more general perspective, kill -9 should only be used once all the more 'polite' kill requests have been attempted, for example kill -TERM.

The problem with kill -9 is that it is a hard, non-catchable kill signal - this means there is no opportunity for the process to handle the request and clean up before it quits. This can leave shared resources (shared memory, locks, temporary files) unreleased or in an inconsistent state.

Rather than going straight for kill -9, at least try kill -15 (TERM) first. If that doesn't work, then pull out the big gun.
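
To make the "polite first" order concrete, here is a rough sketch of a shell helper. The name polite_kill and the 10 second default are my own invention for illustration, not anything standard. It sends TERM, waits for the process to go away, and only reports if it is still there, so the decision to use kill -9 stays with you.

```shell
#!/bin/sh
# Hypothetical helper (not a standard tool): ask a process to exit with
# TERM, wait up to TIMEOUT seconds, and report whether it went away.
# Usage: polite_kill PID [TIMEOUT_SECONDS]
polite_kill() {
    pid=$1
    timeout=${2:-10}   # assumed default; pick whatever suits the process

    # SIGTERM (15) can be caught, so the process gets a chance to clean up.
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 sends no signal; it only checks that the pid still exists.
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done

    # One last check, then leave any escalation to the operator.
    kill -0 "$pid" 2>/dev/null || return 0
    echo "pid $pid ignored TERM; kill -KILL is the last resort" >&2
    return 1
}
```

For example, polite_kill 12345 30 would give pid 12345 up to 30 seconds to exit after TERM. For a Postgres backend specifically, I believe select pg_terminate_backend(pid) is the SQL-level way to deliver the same TERM signal, so that is probably worth trying before dropping to the shell at all.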
-- regards,
Tim
--
Tim Cross