Re: Index Corruption

pasman pasmański <pasman.p@xxxxxxxxx> · Mon, 12 Sep 2011 16:50:59 +0200

Hi. Do you have triggers on corrupted tables?

2011/9/12, Dylan Adams <dylan.adams.work@xxxxxxxxx>:
> We moved to PostgreSQL about 2 years ago and have been very happy with it
> overall. The only major issue that we've had is intermittent index
> corruption.
> This manifests itself as either "duplicate key value violates unique
> constraint"
> or "could not read block 37422 of relation 1663/18663/19063: read only 0 of
> 8192
> bytes." REINDEXing the table solves the problem. We do sometimes have bugs
> that
> cause unique index violations, so the first error is especially annoying.
>
> We've encountered the problem with both JDBC and libpq clients. The problem
> has
> persisted through upgrades to our database servers (from 32bit CentOS 5.3
> with
> PostgreSQL 8.3.9 to 64bit CentOS 5.6 with PostgreSQL 8.4.8, all with stock
> kernels).
>
> Our database servers are fully virtualized, running under VMware Server on
> Dell
> PowerEdge Servers. We use battery backed raid controllers (PERC4/5/6),
> configured for RAID 10. We also experienced the problem when we had physical
> database servers. Servers all have 2 CPUs.
>
> PostgreSQL is installed from the PGDG RPMs (yum.postgresql.org). Our only
> non-default postgresql.conf options are the typical tunables:
> shared_buffers,
> checkpoint_segments, effective_cache_size, default_statistics_target, etc.
> WAL
> settings are defaults (save checkpoint_segments). No clustering, no addons.
> We've tuned autovacuum to be more aggressive in an attempt to address some
> bloat
> issues; this didn't seem to have any impact on the frequency of index
> corruption.
>
> The databases are unloaded nightly. We do a weekly pg_dumpall as an
> integrity
> check.
>
> We've only seen the problem in our many weekly batch processes. Typically,
> these
> processes DELETE a large subset of data from a table and then repopulate
> with
> the same values in indexed fields. Depending on the dataset, they run from
> 30
> minutes to 8 hours. These processes are scheduled so that only one program
> (which is single threaded) is updating any given table at a time.
>
> We have 12 database servers and we usually have about 1 incident per week.
> Sometimes we'll go for weeks without any occurrences, and then we'll have a
> flurry of them.
>
> My primary question: is this normal? There isn't an overwhelming amount of
> messages in the archives about index corruption, which leads me to think
> that
> there's something with our configuration or our processes that is making us
> more
> susceptible.
>
> Is there something we should be doing to make index corruption less likely?
> Is
> there anyway to do an index integrity check so we can be more proactive with
> REINDEXing?
>
> Thanks,
> dylan
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

-- 
------------
pasman

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general