On Mon, Sep 21, 2009 at 12:46 PM, Tom Duffey <tduffey@xxxxxxxxxxxxxxxx> wrote:
>
> On Sep 21, 2009, at 12:40 PM, Scott Marlowe wrote:
>
>> On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey <tduffey@xxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hi All,
>>>
>>> We're having numerous problems with a PostgreSQL 8.3.7 database running
>>> on a virtual Linux server w/VMWare ESX. This is not by choice and I have
>>> been asking the operator of this equipment for details about the disk
>>> setup and here's what I got:
>>>
>>> "We have a SAN that is presenting an NFS share. VMWare sees that share
>>> and reads the VMDK file that make up the virtual file system."
>>>
>>> Does anyone with a better understanding of PostgreSQL and VMWare know if
>>> this is an unreliable setup for PostgreSQL? I see things like "NFS" and
>>> "VMWare" and start to get worried.
>>
>> I see VMWare and think performance issues, I see NFS and think dear
>> god help us all. Even if properly set up, NFS is a problem waiting to
>> happen, and it's not reliable storage for a database in my opinion.
>> That said, lots of folks do it. Ask for the NFS mount options from
>> the sysadmin.
>
> Thanks to everyone so far for the insight. I'm trying to get more details
> about the hardware setup but am not making much progress.
>
> Here are some of the errors we're getting. I searched through archives and
> they all seem to point at hardware trouble but is there anything else I
> should be looking at?
>
> ERROR: invalid page header in block 2 of relation "pg_toast_19466_index"
>
> ERROR: invalid memory alloc request size 1667592311
> STATEMENT: COPY public.version_bundle (node_id_hi, node_id_lo, bundle_data)
> TO stdout;
>
> ERROR: unexpected chunk number 1632 (expected 1629) for toast value 19711
> in pg_toast_19184
> STATEMENT: COPY public.data_binval (binval_id, binval_data) TO stdout;
>
> ERROR: invalid page header in block 414 of relation "pg_toast_19460_index"
>
> ERROR: could not open segment 1 of relation 1663/16386/16535 (target block
> 3966127611): No such file or directory
>
> I dealt with some of the above by reindexing or finding and deleting bad
> rows. I can now successfully dump the database but of course have missing
> data so the application is toast. What I'm really wondering now is how to
> prevent this from happening again and if that means moving the database to
> new hardware.

Definitely sounds like file system corruption to me. And who knows
what's gotten hammered that hasn't caused an error, eh? Time to move
to a standalone db server or get a sysadmin who knows how to set up
vmware to make pgsql happy.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
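
For anyone triaging a similar server log, here is a minimal Python sketch that tallies the kinds of errors quoted in this thread. The regular expressions are derived only from the messages shown here, and the category names and the `classify_log_lines` / `summarize` helpers are hypothetical names made up for this example. A match only flags a line worth investigating; it does not prove corruption or say what caused it.

```python
import re
from collections import Counter

# Patterns copied from the error messages quoted in this thread.
# The category names on the left are hypothetical labels for this sketch.
PATTERNS = {
    "invalid_page_header": re.compile(
        r'invalid page header in block (\d+) of relation "([^"]+)"'
    ),
    "bad_alloc_size": re.compile(r"invalid memory alloc request size (\d+)"),
    "toast_chunk_mismatch": re.compile(
        r"unexpected chunk number (\d+) \(expected (\d+)\) for toast value (\d+)"
    ),
    "missing_segment": re.compile(r"could not open segment (\d+) of relation (\S+)"),
}


def classify_log_lines(lines):
    """Return (category, line) pairs for every line matching a known pattern."""
    hits = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, line.strip()))
                break  # one category per line is enough for a tally
    return hits


def summarize(lines):
    """Count how many log lines fall into each category."""
    return Counter(name for name, _ in classify_log_lines(lines))


# Sample input: the errors from this thread, each joined onto a single line.
SAMPLE = [
    'ERROR: invalid page header in block 2 of relation "pg_toast_19466_index"',
    "ERROR: invalid memory alloc request size 1667592311",
    "ERROR: unexpected chunk number 1632 (expected 1629) for toast value 19711 in pg_toast_19184",
    'ERROR: invalid page header in block 414 of relation "pg_toast_19460_index"',
    "ERROR: could not open segment 1 of relation 1663/16386/16535 "
    "(target block 3966127611): No such file or directory",
]

if __name__ == "__main__":
    for category, count in sorted(summarize(SAMPLE).items()):
        print(category, count)
```

To run it against a real log, pass an open file object to `summarize` instead of `SAMPLE`; the log location depends on how the server is configured, so no path is assumed here.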