On 10/12/07, Jack Orenstein <jack.orenstein@xxxxxxx> wrote: > Our testing involves cutting power to machines running postgres 7.4, > while under load (lots of reads and writes). When we do this, we often > lose some files under pg_data storing table content. I.e., the file > named for a table's pg_class.oid value is simply gone. This can affect > many tables following a power outage. We know this problem has > occurred when we are unable to access a table, e.g. > > ERROR: relation "xxx" does not exist > > The table is present in the system tables, but the file storing table > content under pg_data is missing. > > Can anyone provide insight on this phenomenon? Why are these files > lost? Are they really lost, or have they simply moved somewhere? What > happens to the disk blocks formerly occupied by the files? > > Getting back in service following this file loss is not a problem; I'm > just trying to understand how postgres gets into this state. First of all, this should not happen on a machine with proper fsyncing. The possible causes are generally either fsync is off in postgresql.conf or the drive array <--> OS layer is lying about fsync operations. The most common hardware cause is IDE / SATA drives / controllers that do not enforce fsync, but instead fsync when the data is written to drive / controller buffer memory and continue on. On IDE / SATA drives you can often fix this by turning off the cache. The best way to work with this is to get a known reliable battery backed caching RAID controller which will make the system fast as well as reliable. LSI, Areca, and Escalade are all known to make reliable controllers. Late model Adaptecs have gotten some good reports also, but their earlier controllers were terrible. ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match