Hi,
I'm using AWS RDS Postgres (9.6.6) and have run into very slow shutdowns (10+ minutes) a few times when making database modifications (e.g. reboot, changing instance size, etc.). Other times, it shuts down quickly (1 minute or so). I have not been able to figure out why sometimes it takes a long time to shutdown.
When it happens, I see a bunch of lines in the postgres log like the following over and over (almost every second or two) during this 10 minute shutdown period:
2018-09-12 06:37:01 UTC:XXX.XXX.XXX.XXX(19712):my_user@my_db:[16495]:FATAL:
2018-09-12 06:37:01 UTC:localhost(31368):rdsadmin@rdsadmin:[16488]:FATAL: the database system is shutting down
Once I start seeing these messages, I start manually shutting down all of our applications that are connected to the db. I'm not sure if shutting down the apps fixes it or if there's some timeout on the RDS side, but it seems like once I start doing this, the database finally shuts down.
When it takes this long to shut down, it ends up causing a lot more downtime than I would like. I've tried asking AWS's support why it takes so long to shutdown sometimes, but they basically just told me that's "how it works" and that I should try to shut down all of my connections ahead of time before making database modifications.
We just have a few ruby on rails applications connected to the database, and don't really have any long running or heavy queries and the db is under very light load, so I don't understand why it takes so long to shutdown. We do have a sizeable number of connections though (about 600) and there are two replicas connected to it. I also tried setting idle_in_transaction_session_timeout to 300 seconds to see if that would help, but it made no difference.
I was wondering if anyone else had seen this behavior on their RDS Postgres instances or had any suggestions on how I could shorten the shutdown time?
Thanks,
Chris