Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections

On 10/06/11 10:21 AM, Sean Laurent wrote:
We've been running into a particularly strange problem that I'm trying to better understand. The super short version is that when I run a backup during periods of higher load, our application servers lose their connection to the database and then fail to reconnect.

Here's an overview of the setup:

- PostgreSQL 9.0.1 hosted on a cc1.4xlarge Amazon EC2 instance running CentOS 5.6
- 8 disk RAID-0 array of EBS volumes used for primary data storage
- 4 disk RAID-0 array of EBS volumes used for transaction logs
- Root partition is ext3
- RAID arrays are xfs

Backups are taken using a script that runs the following workflow:

- Tell Postgres to start a backup: SELECT pg_start_backup('RAID backup');
- Run "xfs_freeze" on the primary RAID array
- Tell Amazon to take snapshots of each of the EBS volumes
- Run "xfs_freeze -u" to thaw the primary RAID array
- Run "xfs_freeze" on the transaction log RAID array
- Tell Amazon to take snapshots of each of the EBS volumes
- Run "xfs_freeze -u" to thaw the transaction log RAID array
- Tell Postgres the backup is finished: SELECT pg_stop_backup();
- Remove old WAL files

The whole process takes roughly 7 seconds on average. The RAID arrays are frozen for roughly 2 seconds on average.
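For reference, the workflow described above can be written out as an ordered dry-run plan. This is only a sketch of the sequence as posted, not the poster's actual script: the mount points and volume IDs are hypothetical, the `-f` freeze flag is taken from the xfs_freeze man page, and `aws ec2 create-snapshot` is assumed as a stand-in for whatever EBS snapshot tool was really used. Nothing is executed; the function only returns the commands as strings.

```python
def build_backup_plan(data_mount, wal_mount, data_volumes, wal_volumes):
    """Return the posted backup workflow as an ordered list of command strings.

    Dry run only: nothing is executed. Mount points and volume IDs are
    hypothetical inputs, and the snapshot command is an assumed stand-in.
    """
    plan = ["psql -c \"SELECT pg_start_backup('RAID backup');\""]
    # Data array first, then the transaction log array, as in the post.
    for mount, volumes in ((data_mount, data_volumes), (wal_mount, wal_volumes)):
        plan.append(f"xfs_freeze -f {mount}")   # all writes block from here...
        for volume in volumes:
            plan.append(f"aws ec2 create-snapshot --volume-id {volume}")
        plan.append(f"xfs_freeze -u {mount}")   # ...until the thaw
    plan.append('psql -c "SELECT pg_stop_backup();"')
    plan.append("# remove old WAL files (command not given in the post)")
    return plan


if __name__ == "__main__":
    for step in build_backup_plan("/data", "/wal", ["vol-data-1"], ["vol-wal-1"]):
        print(step)
```

Laid out this way, it is easy to see that every snapshot request for an array is issued while that array's filesystem is frozen.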


While xfs_freeze is in effect, all writes are blocked. This is NOT what you want to do here. Postgres does NOT expect you to take an atomic snapshot of the database files; rather, bracketing your backup with pg_start_backup and pg_stop_backup puts things in a state where a file-by-file backup will be fine.
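A minimal sketch of that bracketing, with no filesystem freeze involved, might look like the following. The `run_sql` and `copy_files` callables are hypothetical placeholders (for example, a wrapper around psql and a tar or rsync of the data directory); they are injected so the ordering can be seen without a live server. The try/finally is a design choice of this sketch: it makes sure pg_stop_backup is still issued if the copy fails, although a failed copy should of course not be treated as a usable backup.

```python
def bracketed_backup(run_sql, copy_files, label="file-level backup"):
    """Run a file-by-file copy between pg_start_backup and pg_stop_backup.

    run_sql    -- hypothetical callable that sends one SQL statement to Postgres
    copy_files -- hypothetical callable that copies the data directory
    """
    run_sql(f"SELECT pg_start_backup('{label}');")
    try:
        # No xfs_freeze here: the files may change while they are copied.
        copy_files()
    finally:
        # Always leave backup mode, even if the copy raised an exception.
        run_sql("SELECT pg_stop_backup();")


if __name__ == "__main__":
    log = []
    bracketed_backup(log.append, lambda: log.append("copy data directory"))
    print("\n".join(log))
```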

From the xfs_freeze man page:

   xfs_freeze halts new access to the filesystem and creates a stable
   image on disk. xfs_freeze is intended to be used with volume
   managers and hardware RAID devices that support the creation of
   snapshots.

   The mount-point argument is the pathname of the directory where the
   filesystem is mounted. The filesystem must be mounted to be frozen
   (see mount(8) <http://linux.die.net/man/8/mount>).

   The -f flag requests the specified XFS filesystem to be frozen from
   new modifications. When this is selected, all ongoing transactions
   in the filesystem are allowed to complete, new write system calls
   are halted, other calls which modify the filesystem are halted, and
   all dirty data, metadata, and log information are written to disk.
   Any process attempting to write to the frozen filesystem will block
   waiting for the filesystem to be unfrozen.


When Postgres's writer processes block on the frozen filesystem, I suspect things go sour fast.
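One way to check that suspicion would be to time a small write-plus-fsync on the affected filesystem while the backup script runs. The probe below is a rough sketch, not a diagnostic tool: the file name `freeze_probe.tmp` is made up, and the directory to probe is whatever mount point you pass in. On a frozen filesystem the call would be expected to stall until the thaw, so the returned time should roughly reflect how long a writer was held up.

```python
import os
import tempfile
import time


def timed_write(directory, payload=b"probe\n"):
    """Create, write, and fsync a small file in `directory`.

    Returns the elapsed wall-clock seconds. The probe file name is a
    hypothetical choice; the file is removed after timing.
    """
    path = os.path.join(directory, "freeze_probe.tmp")
    start = time.monotonic()
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # force the data to disk, as a WAL write would
    elapsed = time.monotonic() - start
    os.remove(path)
    return elapsed


if __name__ == "__main__":
    # A temporary directory stands in for the RAID mount point here.
    with tempfile.TemporaryDirectory() as scratch:
        print(f"write+fsync took {timed_write(scratch):.4f} s")
```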




--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general