With your corrupted index theory, would simply rebuilding the index “fix” the issue? Is there any other way to diagnose if this is indeed the cause?
Thanks for your help,
Erik Peterson
On 7/17/07 10:54 AM, "Tom Lane" <tgl@xxxxxxxxxxxxx> wrote:
"Erik Peterson" <epeterson@xxxxxxxxxxxxxx> writes:
> I'm having this issue where once or twice per day (out of ~100,000 =
> queries)
> the table doesn't reflect a committed update immediately. Usually when =
> this
> problem occurs the update takes 1-3 minutes to be reflected in SELECT
> queries. Occasionally, it has taken more than 10 minutes.
> The session could go something like this:
> UPDATE mytable SET myattribute=1 WHERE id=14;
> COMMIT;
> SELECT myattribute FROM mytable WHERE id=14;
> (Query returns myattribute with a value of 0)
> (Wait 5 minutes)
> SELECT myattribute FROM mytable WHERE id=14;
> (Query returns myattribute with a value of 1)
To be blunt, I don't believe it. I can think of bugs by which a commit
might be lost entirely, but there is no mechanism that would make it
good five minutes later. I think you've misdiagnosed your problem
somehow --- either you're not really committing where you think you are,
or the observing query is using an old snapshot (maybe you are running
it in a serializable transaction?)
A somewhat more credible theory would revolve around corrupted indexes.
If there's a corrupted index on "id" in the above example, a query might
sometimes find one version of a row and sometimes find another; although
any given search would be deterministic, apparently-unrelated changes in
the index contents could change which one is found.
Have you actually been able to reproduce a problem as sketched above in
a single session --- ie, the update and the contradictory observations
all done by the same backend? Or is this a representation of things
that are happening in different sessions? What else is going on
meanwhile?
regards, tom lane