Re: Lost rows/data corruption?

Marco Colombo <pgsql@xxxxxxxxxx> · Wed, 16 Feb 2005 13:46:55 +0100 (CET)

On Wed, 16 Feb 2005, Andrew Hall wrote:

> fsync is on for all these boxes. Our customers run their own hardware,
> with many different hardware specifications in use. Many of our customers
> don't have a UPS, although their power is probably pretty reliable (normal
> city-based utilities), but of course I can't guarantee they don't get an
> outage once in a while with a thunderstorm etc.

I see. Well I can't help much, then, I don't run PG on XFS. I suggest testing
on a different FS, to exclude XFS problems. But with fsync on, the FS has
very little to do with reliability, unless it _lies_ about fsync(). Any
FS should return from fsync only after data is on disc, journal or not
(there might be issues with meta-data, but it's hardly a problem with PG).

It's more likely the hardware (IDE disks) lies about data being on the
platter. But again, that only matters in case of a sudden power-off.
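
To make the fsync() point concrete, here is a minimal sketch in Python of
what "return only after data is on disc" means from the application side.
The helper name durable_write and the file name are made up for the
example; this is not how PG itself is written. Note that a successful
fsync() only tells you what the OS and FS claim: if the drive
acknowledges writes from its volatile cache, this code cannot detect it.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> int:
    """Write payload to path; return only after fsync() has returned.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write() may write fewer bytes than requested, so loop.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Ask the OS to push the file's data to the storage device.
        # Whether the device really committed it is out of our hands.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.bin")  # made-up file name
        print(durable_write(target, b"commit record\n"))
```

If the FS (or the disk) returns from fsync() early, the function above
still returns normally, which is exactly why a sudden power-off is the
only case where such a lie shows up.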

[...]
> this condition. I'd be really surprised if XFS is the problem as I know
> there are plenty of other people across the world using it reliably with PG.

This is kind of OT, but I don't follow your logic here.

I don't see why plenty of success stories of XFS+PG suggest to you
the culprit is PG. To me it's still 50% - 50%. :-)

Moreover, XFS is continuously updated (it follows the Linux kernel's
fast release cycle, like any other Linux FS), so other people's
experience is hardly a data point unless they are using _exactly_ the
same versions as you do.

For example, in the kernel changelogs from 2.6.7 to 2.6.10 you can read:

"[XFS] Fix a race condition in the undo-delayed-write buffer routine."

"[XFS] Fix up memory allocators to be more resilient."

"[XFS] Fix a possible data loss issue after an unaligned unwritten
 extent write."

"[XFS] handle inode creating race"

(only a few of them)

Now, I don't have even the faintest idea whether any of that might have
affected you or not, but still the point is that the Linux kernel changes
a lot. And vendors tend to customize their kernels a lot, too. On the
PostgreSQL side, releases are slower-paced, so it's easier.

Anyway, I agree your problem is weird, and that it must be something
on the server side.
No matter what you do on the client side (pool manager, JDBC driver,
servlet engine), the DB should never end up corrupted with duplicated
primary keys.
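
For what it's worth, duplicated keys can be listed with a plain
GROUP BY ... HAVING query. Below is a rough sketch in Python; the table
"orders", the column "id" and the helper find_duplicate_keys are
hypothetical, and sqlite3 is used only as a stand-in so the example runs
anywhere (the demo table deliberately has no primary key, to simulate
the corrupted state). The same SQL should work against PG.

```python
import sqlite3


def find_duplicate_keys(conn, table, key_column):
    """Return (key, count) rows for every key value that occurs more than once.

    Hypothetical helper. table and key_column are interpolated into the SQL,
    so pass trusted identifiers only.
    """
    query = (
        f"SELECT {key_column}, COUNT(*) AS n "
        f"FROM {table} "
        f"GROUP BY {key_column} "
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY {key_column}"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # No PRIMARY KEY constraint here: we want to insert a duplicate on purpose.
    conn.execute("CREATE TABLE orders (id INTEGER, note TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "a"), (2, "b"), (2, "c"), (3, "d")],
    )
    print(find_duplicate_keys(conn, "orders", "id"))  # prints [(2, 2)]
```

On a real server with a possibly damaged index, it may be worth comparing
the result with index scans disabled (SET enable_indexscan = off), so
that the answer comes from the table itself.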

I know this is a silly question, but when you write 'We do nothing with
any indexes' do you mean the indexes are never, _never_ touched (I mean
explicitly, as in drop/create index), i.e. they are created at schema
creation time and then left alone? Just to make sure...

.TM.
--
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@xxxxxx
