Actually, scratch that. The difference in behaviour seems to be on the optimiser which now table scans the 10 row table (which I guess it should always have done really) rather than use the index as it was in 9.1.6.
The same index corruption occurs, so the same reproduction case stands, it just needs a "set enable_seqscan=false" prior to running the selects on the slave.
Cheers,
James
From: James Cowell <jcowell@xxxxxxxxxxxxxx>
To: Jeff Janes <jeff.janes@xxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Monday, 17 December 2012, 20:42
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on master
To: Jeff Janes <jeff.janes@xxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Monday, 17 December 2012, 20:42
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on master
I saw that 9.1.7 was out and it
had another bugfix in for WAL playback.
I installed it and ran my reproduction case and it no longer corrupts the index.
I reindexed the database I load into and did a full data load and the indexes still corrupt on the slave.
It does not appear to be related to constraint violation as one table is new rows only.
I will try and put together a new reproduction case this week.
Cheers,
James
From: James Cowell <jcowell@xxxxxxxxxxxxxx>
To: Jeff Janes <jeff.janes@xxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Thursday, 13 December 2012, 12:26
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on master
To: Jeff Janes <jeff.janes@xxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Thursday, 13 December 2012, 12:26
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on master
Hi Jeff,
Thanks again for your reply.
>If there are no constraint
violations, do you still see the problem?
Yes, I've stripped it down to an empty table with a 10 row load and the pk index on the secondary node still corrupts.
> Were there any older version on which it worked?
I'm afraid I started on 9.1.5, I upgraded to 9.1.6 when I had the initial problem due to the bugfix in the changelog to do with corrupt indexes on the secondary node but it hasn't resolved the issue.
> Can
you post a minimal schema and control file to reproduce the problem?
I've attached a text file with details for table, load config file etc, is that everything you would need?
Cheers,
James
From: Jeff Janes <jeff.janes@xxxxxxxxx>
To: James Cowell <jcowell@xxxxxxxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Monday, 10 December 2012, 16:53
Subject: Re: Corrupt indexes on slave when using pg_bulkload on master
To: James Cowell <jcowell@xxxxxxxxxxxxxx>
Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx>
Sent: Monday, 10 December 2012, 16:53
Subject: Re: Corrupt indexes on slave when using pg_bulkload on master
On Wed, Dec 5, 2012 at 5:17 AM, James Cowell <jcowell@xxxxxxxxxxxxxx> wrote:
> I'm using pg_bulkload to load large amounts of CSV data into a postgres
> database hourly.
>
> This database is replicated to a second node.
>
> Whenever a bulk load happens the indexes on the updated tables on the
> secondary node corrupt and are unusable until a reindex is run on the
> primary node. I get the error below on node 2:
>
> ERROR: index "tablename" contains unexpected zero page at block 0
> SQL state: XX002
> Hint: Please REINDEX it.
>
> I'm assuming that this is because of the way pg_bulkload builds the index on
> the primary, and possibly has something to do with the way pg_bulkload
> overwrites rows in the event of a constraint violation,
If there are no constraint violations, do you still see the problem?
> but at the same time
> if something works on the primary shouldn't the replicated node be able to
> process the WAL log?
>
> I've tried this on 9.1.6 and 9.1.5 on RHEL 6.3 with pg_bulkload build
> 3.1.1-1.pg91.rhel6 and it happens every time.
Were there any older version on which it worked? Can you post a
minimal schema and control file to reproduce the problem?
Cheers,
Jeff