Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

Robert Haas <robertmhaas@xxxxxxxxx> · Tue, 2 Jun 2015 11:37:02 -0400

On Tue, Jun 2, 2015 at 11:27 AM, Andres Freund <andres@xxxxxxxxxxx> wrote:
> On 2015-06-02 11:16:22 -0400, Robert Haas wrote:
>> I'm having trouble figuring out what to do about this.  I mean, the
>> essential principle of this patch is that if we can't count on
>> relminmxid, datminmxid, or the control file to be accurate, we can at
>> least look at what is present on the disk.  If we also cannot count on
>> that to be accurate, we are left without any reliable source of
>> information.  Consider a hypothetical cluster where all our stored
>> minmxids of whatever form are corrupted (say, all change to 1) and in
>> addition there are stray files in pg_multixact.  I don't think there's
>> really any way to get ourselves out of trouble in that scenario.
>
> If we were to truncate after vacuum, and only on the primary (via WAL
> logging), we could, afaics, just rely on all the values to be
> recomputed. I mean relminmxid will be recomputed after a vacuum, and
> thus, after some time, will datminmxid and the control file value.  We
> could just force a value of 1 to always trigger anti-wraparound vacuums
> (or wait for that to happen implicitly, to delay the impact?). That'll
> then should then fix the problem in a relatively short amount of time?

The exact circumstances under which we're willing to replace a
relminmxid with a newly-computed one that differs are not altogether
clear to me, but there's an "if" statement protecting that logic, so
there are some circumstances in which we'll leave the existing value
intact.  If we force non-stop vacuuming in that scenario, autovacuum
will just run like crazy without accomplishing anything, which
wouldn't be good.  It would similarly do so when the oldest MXID
reference in the relation is in fact 1, but that value can't be
vacuumed away yet.

Also, the database might be really big.  Even if it were true that a
full scan of every table would get us out of this state, describing
the time that it would take to do that as "relatively short" seems to
me to be considerably understating the impact of a full-cluster
VACUUM.

With regard to the more general question of WAL-logging this, are you
going to work on that?  Are you hoping Alvaro or I will work on that?
Should we draw straws?  It seems like somebody needs to do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general