Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

Steve Kehlet wrote:
> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
> just dropped new binaries in place) but it wouldn't start up. I found this
> in the logs:
> 
> waiting for server to start....2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>  database system was shut down at 2015-05-27 13:12:55 PDT
> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
> starting up
> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
> transaction 1

I am currently debugging a problem that looks very similar to
this.  AFAICT the problem is that WAL replay of an online checkpoint in
which multixact files are removed fails because replay tries to read a
file that has already been removed.

(I was nervous about removing the check that skipped reading
pg_multixact files during recovery.  Looks like my hunch was right,
though the actual problem is not the one I was fearing.)

I think the fix to this is to verify whether the file exists on disk
before reading it; if it doesn't, assume the truncation has already
happened and that it's not necessary to remove it.

> I found [this report from a couple days ago](
> https://bugs.archlinux.org/task/45071) from someone else that looks like
> the same problem.

Right :-(

I think a patch like this should be able to fix it ... not tested yet.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9568ff1..bb8cbd7 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2208,6 +2208,12 @@ SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid)
 	 * to one.  It will instead point to the multixact ID that will be
 	 * assigned the next time one is needed.
 	 *
+	 * Note that when this is called during xlog replay, the required files
+	 * might have already been removed, and it would be an error to try to read
+	 * them.  To work around this, we test the file for existence before trying
+	 * to read it; if the file doesn't exist, we just don't read it.  We trust
+	 * that a further call to this routine later will set things straight.
+	 *
 	 * NB: oldest_dataminmxid is the oldest multixact that might still be
 	 * referenced from a table, unlike in DetermineSafeOldestOffset, where we
 	 * do this same computation based on the oldest value that might still
@@ -2217,16 +2223,24 @@ SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid)
 	 * new multixacts, which requires the old ones to have first been
 	 * truncated away by a checkpoint.
 	 */
-	LWLockAcquire(MultiXactGenLock, LW_SHARED);
-	if (MultiXactState->nextMXact == oldest_datminmxid)
-	{
-		oldestOffset = MultiXactState->nextOffset;
-		LWLockRelease(MultiXactGenLock);
-	}
-	else
 	{
+		MultiXactId	nextMulti;
+		MultiXactOffset nextOffset;
+		int			pageno;
+
+		/* grab data that requires lock first */
+		LWLockAcquire(MultiXactGenLock, LW_SHARED);
+		nextMulti = MultiXactState->nextMXact;
+		nextOffset = MultiXactState->nextOffset;
 		LWLockRelease(MultiXactGenLock);
-		oldestOffset = find_multixact_start(oldest_datminmxid);
+
+		pageno = MultiXactIdToOffsetPage(oldest_datminmxid);
+
+		if ((nextMulti != oldest_datminmxid) &&
+			(!InRecovery || SimpleLruDoesPhysicalPageExist(pageno)))
+			oldestOffset = find_multixact_start(oldest_datminmxid);
+		else
+			oldestOffset = nextOffset;
 	}
 
 	/* Grab lock for just long enough to set the new limit values */
-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
