Re: Lost rows/data corruption?

Scott Marlowe <smarlowe@xxxxxxxxxxxxxxxxx> · Wed, 16 Feb 2005 08:18:01 -0600

On Wed, 2005-02-16 at 07:14, Alban Hertroys wrote:
> Marco Colombo wrote:
> > On Wed, 16 Feb 2005, Andrew Hall wrote:
> > 
> >> fsync is on for all these boxes. Our customers run their own hardware 
> >> with many different specification of hardware in use. Many of our 
> >> customers don't have UPS, although their power is probably pretty 
> >> reliable (normal city based utilities), but of course I can't 
> >> guarantee they don't get an outage once in a while with a thunderstorm 
> >> etc.
> > 
> > 
> > I see. Well I can't help much, then, I don't run PG on XFS. I suggest
> > testing on a different FS, to exclude XFS problems. But with fsync on,
> > the FS has very little to do with reliability, unless it _lies_ about
> > fsync(). Any FS should return from fsync only after data is on disc,
> > journal or not (there might be issues with meta-data, but it's hardly
> > a problem with PG).
> > 
> > It's more likely the hardware (IDE disks) lies about data being on plate.
> > But again that's only in case of sudden poweroffs.
> 
> Do you happen to have the same type disks in all these systems? That 
> could point to a disk cache "problem" (f.e. the disks lying about having 
> written data from the cache to disk).
> 
> Or do you use the same disk parameters on all these machines? Have you 
> tried using the disks w/o write caching and/or in synchronous mode 
> (contrary to "async").

I was wondering whether this problem has ever shown up on a machine that
HADN'T lost power abruptly.  IFF the only machines that experience
corruption lost power at some point beforehand, then I would look
towards the drives, the controller or the file system.
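
A rough way to check whether a drive or controller acknowledges writes
before they are on the platter is to time repeated fsync() calls on the
same block. The sketch below is only an illustration: the function name
fsync_rate and its parameters are made up for this example, and it
assumes a single plain disk. A 7200 RPM drive turns 120 times a second,
so an honest fsync() on the same block should not complete much more
often than that. A rate far above it suggests a write cache somewhere is
answering early (a battery-backed controller cache would also produce
high numbers, and legitimately so).

```python
import os
import tempfile
import time


def fsync_rate(directory=".", writes=200, size=8192):
    """Return observed fsync() calls per second for rewrites in `directory`.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    block = b"x" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to flush to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(f"{fsync_rate():.0f} fsyncs/sec")
```

Run it in a directory on the disk that holds the database files. The
number is only a hint, not proof either way.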

I know there are write modes in ext3 that will allow corruption on power
loss (I think it's writeback).  I know little of XFS in a production
environment, as I run ext3, warts and all.
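
To see which journaling mode an ext3 file system is actually mounted
with, the mount options can be read from /proc/mounts. This is a minimal
sketch, assuming a Linux box; the function name ext3_data_modes is
invented for the example. Note that the kernel may not list a data=
option at all when the default is in use (normally ordered), so a
missing entry is not evidence of writeback.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= mode, or None if not listed.

    Hypothetical helper; expects text in /proc/mounts format.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mount_point, mode in ext3_data_modes(f.read()).items():
            print(mount_point, mode or "(data= not listed)")
```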
