Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

On Mon, Jun 1, 2015 at 4:55 PM, Noah Misch <noah@xxxxxxxxxxxx> wrote:
> While testing this (with inconsistent-multixact-fix-master.patch applied,
> FWIW), I noticed a nearby bug with a similar symptom.  TruncateMultiXact()
> omits the nextMXact==oldestMXact special case found in each other
> find_multixact_start() caller, so it reads the offset of a not-yet-created
> MultiXactId.  The usual outcome is to get rangeStart==0, so we truncate less
> than we could.  This can't make us truncate excessively, because
> nextMXact==oldestMXact implies no table contains any mxid.  If nextMXact
> happens to be the first of a segment, an error is possible.  Procedure:
>
> 1. Make a fresh cluster.
> 2. UPDATE pg_database SET datallowconn = true
> 3. Consume precisely 131071 mxids.  Number of offsets per mxid is unimportant.
> 4. vacuumdb --freeze --all
>
> Expected state after those steps:
> $ pg_controldata | grep NextMultiXactId
> Latest checkpoint's NextMultiXactId:  131072
>
> Checkpoint will fail like this:
> 26699 2015-05-31 17:22:33.134 GMT LOG:  statement: checkpoint
> 26661 2015-05-31 17:22:33.134 GMT DEBUG:  performing replication slot checkpoint
> 26661 2015-05-31 17:22:33.136 GMT ERROR:  could not access status of transaction 131072
> 26661 2015-05-31 17:22:33.136 GMT DETAIL:  Could not open file "pg_multixact/offsets/0002": No such file or directory.
> 26699 2015-05-31 17:22:33.234 GMT ERROR:  checkpoint request failed
> 26699 2015-05-31 17:22:33.234 GMT HINT:  Consult recent messages in the server log for details.
> 26699 2015-05-31 17:22:33.234 GMT STATEMENT:  checkpoint
>
> This does not block startup, and creating one mxid hides the problem again.
> Thus, it is not a top-priority bug like some other parts of this thread.  I
> mention it today mostly so it doesn't surprise hackers testing other fixes.

Thanks.  As mentioned elsewhere in the thread, I discovered that the
same problem exists at page boundaries, with a different error
message.  I've tried the attached repro scripts (one for the page
boundary, one for the segment boundary) on 9.3.0, 9.3.5, 9.4.1 and
master, with the same results on each:

FATAL:  could not access status of transaction 2048
DETAIL:  Could not read from file "pg_multixact/offsets/0000" at offset 8192: Undefined error: 0.

FATAL:  could not access status of transaction 131072
DETAIL:  Could not open file "pg_multixact/offsets/0002": No such file or directory.

But, yeah, this isn't the bug we're looking for.
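
For readers wondering why 2048 and 131072 are the interesting values, here is a rough sketch of the arithmetic that maps a MultiXactId to a location under pg_multixact/offsets. It assumes the default 8192-byte block size, 4-byte offset entries and 32 pages per SLRU segment; the function name locate_offset_entry is made up for illustration and is not a PostgreSQL API.

```python
# Sketch only: assumes a default build (BLCKSZ = 8192), 4-byte
# MultiXactOffset entries and 32 SLRU pages per segment file.
BLCKSZ = 8192
OFFSET_SIZE = 4
PAGES_PER_SEGMENT = 32
OFFSETS_PER_PAGE = BLCKSZ // OFFSET_SIZE  # 2048 mxids per page


def locate_offset_entry(mxid):
    """Return (file name, byte offset of the page, byte offset within the page).

    Hypothetical helper for illustration; not part of PostgreSQL.
    """
    page = mxid // OFFSETS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    page_in_segment = page % PAGES_PER_SEGMENT
    entry = mxid % OFFSETS_PER_PAGE
    # Segment files are named with four uppercase hex digits.
    filename = "pg_multixact/offsets/%04X" % segment
    return filename, page_in_segment * BLCKSZ, entry * OFFSET_SIZE


if __name__ == "__main__":
    for mxid in (2048, 131071, 131072):
        print(mxid, locate_offset_entry(mxid))
```

Under those assumptions, mxid 2048 is the first entry of the second page of file 0000 (page offset 8192), and mxid 131072 is the first entry of file 0002, which matches the file names and offset in the two DETAIL lines above.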

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment: checkpoint-page-boundary.sh
Description: Bourne shell script

Attachment: checkpoint-segment-boundary.sh
Description: Bourne shell script

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)