Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happened during repeatable vacuum


On Tue, Mar 01, 2011 at 08:40:37AM -0500, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 10:32 PM, Greg Stark <gsstark@xxxxxxx> wrote:
> > On Tue, Mar 1, 2011 at 1:43 AM, David Christensen <david@xxxxxxxxxxxx> wrote:
> >> Was this cluster upgraded to 8.4.4 from 8.4.0?  It sounds to me like a known bug in 8.4.0 which was fixed by this commit:
> >>
> >
> > The reproduction script described was running vacuum repeatedly. A
> > single vacuum run ought to be sufficient to clean up the problem if it
> > was left-over.
> >
> > I wonder if it would help to write a regression test that runs 100 or
> > so vacuums and see if the build farm turns up any examples of this
> > behaviour.
> 
> One other thing to keep in mind here is that the warning message we've
> chosen can be a bit misleading.  The warning is:
> 
> WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 1
> 
> ...which implies that the state of the tuples is correct, and that the
> page-level bit is wrong in comparison.  But I recently saw a case
> where the infomask got clobbered, resulting in this warning.  The page
> level bit was correct, at least relative to the intended page
> contents; it was a tuple on the page that was screwed up.  It
> might have been better to pick a more neutral phrasing, like "page is
> marked all-visible but some tuples are not visible".

Yeesh. Yikes. I hope that this is not the case, as we are seeing thousands of
these daily on each of 4 large production hosts, mostly on catalogs and
especially pg_statistic. However, it does occur on some user tables with high
delete/insert traffic too.

Question: what would be the consequence of simply patching out the setting
of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only
problem (big assumption perhaps) then simply never setting it would at least
avoid the possibility of returning wrong answers, presumably at some
performance cost. We could possibly live with that until we get a handle
on the real cause and a fix.

I had a look and don't really see anything except lazy vacuum (vacuumlazy.c)
that sets it, so it seems simple to disable.

Or have I understood this incorrectly?

Anything else I can be doing to try to track this down?
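
One thing that might help narrow it down is tallying the warnings by relation
and page, to see whether the same pages keep coming back. A rough sketch,
assuming the server log lines contain the warning text exactly as quoted
above; the function name tally_warnings is just made up for illustration:

```python
import re
from collections import Counter

# Pattern built from the warning text quoted earlier in this thread.
# Any log_line_prefix in front of "WARNING:" is ignored by search().
WARNING_RE = re.compile(
    r'WARNING:\s+PD_ALL_VISIBLE flag was incorrectly set '
    r'in relation "(?P<rel>[^"]+)" page (?P<page>\d+)'
)


def tally_warnings(lines):
    """Count matching warnings per relation and per (relation, page).

    `lines` is any iterable of log lines, e.g. an open log file.
    Hypothetical helper, not part of PostgreSQL.
    """
    by_relation = Counter()
    by_page = Counter()
    for line in lines:
        m = WARNING_RE.search(line)
        if m is None:
            continue
        rel = m.group("rel")
        page = int(m.group("page"))
        by_relation[rel] += 1
        by_page[(rel, page)] += 1
    return by_relation, by_page
```

Feeding it an open log file and printing by_relation.most_common() would
show which relations dominate, and by_page would show whether particular pages
are hit repeatedly or the warnings are spread around.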

-dg

-- 
David Gould       daveg@xxxxxxxxx      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.
