On Mon, Oct 10, 2011 at 8:09 AM, Craig Ringer <ringerc@xxxxxxxxxxxxx> wrote:
> On 10/07/2011 01:21 AM, Sean Laurent wrote:
>> Within a few seconds of the backup, our application servers start
>> throwing exceptions that indicate the database connection was closed.
>> Meanwhile, Postgres still shows the connections and we start seeing a
>> really high number (for us) of locks in the database. The application
>> servers refuse to recover and must be killed and restarted. Once they're
>> killed off, the connections actually go away and the locks disappear.
>
> Did you have any luck with this?

No, but I have avoided it by simply not using xfs_freeze or taking EBS
volume snapshots. Instead I've started taking pg_dumps off the slave
database.

> This sort of thing sounds a lot like "deadlock" ... but I'm not really sure
> how Pg's backends/postmaster could get into a deadlock with each other. It'd
> be interesting to look at "wchan" in ps to see what the Pg processes are
> waiting on.

That's definitely a strong contender. It may be that the xfs_freeze
timing was an unrelated problem or even just a coincidence.

> Can you reproduce this on a non-EC2 system?

Unfortunately, we don't have the hardware resources to test this on a
non-EC2 system.

--
Sean Laurent
Director of Operations
StudyBlue, Inc.