Sorry for replying to myself, but does anyone have any idea what the problem could be? We would like to run some tests based on your suggestions to see what could cause this problem and how to mitigate it.
Regards,
Strahinja
On Fri, Nov 15, 2013 at 11:44 AM, Strahinja Kustudić <strahinjak@xxxxxxxxxxx> wrote:
Hi all,

Last week we migrated 200+ of our servers from one rack to another and the procedure was dead simple: power off the server from the OS, unplug it, move it to a different rack, plug it in and start it. The problem was that after boot some of the servers had corrupted indexes.

Servers are Dell PowerEdge R420 with an H700 RAID controller with BBU, CentOS 5.9 x64 with Postgres 9.1.9 running on two Intel 330 120GB SSDSC2CT120 drives (one for data, and one for indexes) on XFS (noatime,nobarrier,noquota). Relevant Postgres configuration is:
wal_level = minimal
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
synchronous_commit = off
wal_buffers = -1

Also, we disabled the disk write cache on all drives with the MegaCli64 utility, because the RAID controller should be the one doing the caching, since it has a BBU.
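
To check the same settings across many servers, something like the following could be used. It is only a minimal sketch in Python, assuming the settings are available as plain `name = value` lines. The names `parse_settings`, `risky_settings` and `CORRUPTION_RISK_IF_OFF` are hypothetical helpers and not part of any Postgres tool. It only looks at `fsync` and `full_page_writes`, the two settings the PostgreSQL documentation warns can lead to corruption after a crash when turned off.

```python
# Hypothetical helper (not part of PostgreSQL): parse "name = value" lines
# and flag settings that are documented as unsafe after a crash.

CONFIG_TEXT = """\
wal_level = minimal
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
synchronous_commit = off
wal_buffers = -1
"""

# Settings whose "off" value can lead to corruption after a crash or power loss.
CORRUPTION_RISK_IF_OFF = ("fsync", "full_page_writes")


def parse_settings(text):
    """Return a dict of settings from postgresql.conf-style lines.

    Simplification: everything after '#' is treated as a comment,
    so quoted values containing '#' are not handled.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("'")
    return settings


def risky_settings(settings):
    """List settings from CORRUPTION_RISK_IF_OFF that are set to off.

    A missing setting is treated as "on", which is the default for both.
    """
    return [name for name in CORRUPTION_RISK_IF_OFF
            if settings.get(name, "on").lower() == "off"]


if __name__ == "__main__":
    print(risky_settings(parse_settings(CONFIG_TEXT)))
```

For the configuration listed above this reports nothing. `synchronous_commit = off` is deliberately not in the list, since according to the documentation it can lose the most recent commits after a crash but should not by itself cause corruption.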
Does anyone have any idea why we could be getting index corruption?

Thanks in advance

Regards,
Strahinja