Search Postgresql Archives

Re: Help making tablespaces work for my application

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/23/2011 09:37 AM, Natusch, Paul wrote:

I have an application for which data is being written to many disks simultaneously.  I would like to use a postgres table space on each disk.   If one of the disks crashes it is tolerable to lose that data, however, I must continue to write to the other disks.


Tablespaces are not useful for implementing this idea yet.


    1.       There is a single WAL log for the entire cluster, located in the pg_log subdirectory. If the disk containing the pg_log file crashed, does that mean the system would come to a halt.  Is there anyway to distribute this data so that WAL is located on the same media as the table space? An alternative would be to use raid with the disk that stores the pg_log subdirectory but that adds cost to the system.


Loss of the pg_xlog subdirectory and the WAL contained in it normally results in catastrophic database failure.  Recommended practice is to use a RAID-1 volume to make odds of that failure lower.
 

2.        If #1 was solved by using the raid approach, what happens if one of the disks containing one of my table spaces crashes.  At some point postgres will want to write the data from the WAL file to the crashed (unavailable) disk.    Will postgres will be blocked at this point?  Is there some way to notify postgres that a specific disk is no longer available and that the entries in the WAL for this disk should either be purged or ignored? ( I’m willing to “throw away” the data on the crashed disk).  


PostgreSQL can't be expected to operate sanely when faced with the loss of an individual tablespace.  It may be possible to recover from it, but you'll be doing something it's not designed to handle, and that effort may not succeed.

Note that any tablespace failure is likely to require taking down the database to repair the involved tablespaces, so you're likely to have downtime between a component failure and when you notice to take action.  The database really does not like having tablespaces just go away in the middle of operations.

PostgreSQL 9.1 (not released yet, currently in beta) includes a new feature called "unlogged tables" that might make this sort of deployment possible.  If you created a tablespace for disposable data and put an unlogged table onto it, loss of that tablespace would me much less likely to cause a problem.  So long as you recreated a new space for the unlogged table after restarting, you could probably recover having only lost the data on the crashed disk in this situation.


Clearly using raid on all of the disks would be a solution, but that is cost prohibitive.



On a cheap server I can easily RAID-1 mirror a pair of drives on Linux using software RAID, and individual drives are $50 to $100 each.  If your data isn't worth that much, And even that's not enough to really make me feel secure about the data--you really need to keep another copy around as a backup, too.  You can treat your data as disposable and expect to lose it when any single component fails, or you can include some good redundancy practices in the design to reduce odds of a failure.  There really isn't really a good solution providing partial protection in the middle of those two.

-- 
Greg Smith   2ndQuadrant US    greg@xxxxxxxxxxxxxxx   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux