Re: Lots of stuck queries after upgrade to 9.4

On 07/28/2015 11:36 PM, Heikki Linnakangas wrote:
> A-ha, I succeeded in reproducing this now on my laptop, with pgbench! It
> seems to be important to have a very large number of connections:
>
> pgbench -n -c400 -j4 -T600 -P5
>
> That got stuck after a few minutes. I'm using commit_delay=100.
>
> Now that I have something to work with, I'll investigate this more tomorrow.

Ok, it seems that this is caused by the same issue that I found with my synthetic test case, after all. It is possible to get a lockup because of it.

For the archives, here's a hopefully easier-to-understand explanation of how the lockup happens. It involves three backends. A and C are inserting WAL records, while B is flushing the WAL with commit_delay. The byte positions 2000, 2100, 2200, and 2300 are offsets within a WAL page. 2000 points to the beginning of the page, while the others are later positions on the same page. WaitToFinish() is an abbreviation for WaitXLogInsertionsToFinish(). "Update pos X" means a call to WALInsertLockUpdateInsertingAt(X). "Reserve A-B" means a call to ReserveXLogInsertLocation(), which returned StartPos A and EndPos B.

Backend A		Backend B		Backend C
---------		---------		---------
Acquire InsertLock 2
Reserve 2100-2200
			Calls WaitToFinish()
			  reservedUpto is 2200
			  sees that Lock 1 is
			  free
						Acquire InsertLock 1
						Reserve 2200-2300
						GetXLogBuffer(2200)
						 page not in cache
						 Update pos 2000
						 AdvanceXLInsertBuffer()
						  run until about to
						  acquire WALWriteLock
GetXLogBuffer(2100)
 page not in cache
 Update pos 2000
 AdvanceXLInsertBuffer()
  Acquire WALWriteLock
  write out old page
  initialize new page
  Release WALWriteLock
finishes insertion
release InsertLock 2
			WaitToFinish() continues
			  sees that lock 2 is
			  free. Returns 2200.

			Acquire WALWriteLock
			Call WaitToFinish(2200)
			  blocks on Lock 1,
			  whose insertingAt
			  is 2000.

At this point, there is a deadlock between B and C. B is waiting for C to release insertion lock 1 or advance its insertingAt value to at least 2200, while C is waiting for WALWriteLock, held by B.
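
The condition that keeps B waiting can be modeled in a few lines of C. This is a simplified sketch and not the xlog.c code: the struct InsertLockModel and the function waiter_must_block() are hypothetical names made up for illustration, and the real WaitXLogInsertionsToFinish() works on LWLocks rather than plain fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one WAL insertion lock. */
typedef struct
{
    bool     held;         /* is some backend holding the lock? */
    uint64_t insertingAt;  /* position the holder has advertised */
} InsertLockModel;

/*
 * Returns true if a backend waiting for insertions up to 'upto' has to keep
 * waiting on this lock: the lock is held, and its holder has not yet
 * advertised a position at or beyond 'upto'.
 */
bool
waiter_must_block(const InsertLockModel *lock, uint64_t upto)
{
    assert(lock != NULL);
    return lock->held && lock->insertingAt < upto;
}
```

In the scenario above, Lock 1 is held by C with an advertised position of 2000, so a call with upto = 2200 reports that B must block. Had C advertised 2200 instead, B could have proceeded, written out the WAL, and released WALWriteLock for C.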

To fix that, let's change GetXLogBuffer() to always advertise the exact position, not the beginning of the page (except when inserting the first record on the page, just after the page header; see the comments in the patch).
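
The rule for choosing the advertised position can be tried out in isolation with a standalone sketch like the one here. The function name advertised_position() is hypothetical, and the constant values are assumptions chosen for illustration (8 kB WAL pages, 16 MB segments, 24-byte short and 40-byte long page headers); in PostgreSQL they come from the build configuration and the page header struct sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define XLOG_BLCKSZ          8192u
#define XLOG_SEG_SIZE        (16u * 1024u * 1024u)
#define SizeOfXLogShortPHD   24u
#define SizeOfXLogLongPHD    40u

typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper: the position an inserter would advertise before
 * calling AdvanceXLInsertBuffer(). If 'ptr' points just past a page header,
 * step back to the start of the page, because the page may not be
 * initialized yet. Otherwise advertise 'ptr' itself.
 */
XLogRecPtr
advertised_position(XLogRecPtr ptr)
{
    XLogRecPtr result;

    if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
        ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
        result = ptr - SizeOfXLogShortPHD;   /* just after a short header */
    else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
             ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
        result = ptr - SizeOfXLogLongPHD;    /* first page of a segment */
    else
        result = ptr;                        /* mid-page: exact position */

    /* We never advertise more than has actually been inserted. */
    assert(result <= ptr);
    return result;
}
```

Under these assumptions a mid-page position such as 2200 is advertised unchanged, which is what lets a waiter like B in the scenario above make progress instead of seeing the page start.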

This fixes the problem for me. I've been running pgbench for about 30 minutes without lockups now, while without the patch it locked up within a couple of minutes. Spiros, can you easily test this patch in your environment? Would be nice to get a confirmation that this fixes the problem for you too.

- Heikki

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8e9754c..307a04c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1839,11 +1839,32 @@ GetXLogBuffer(XLogRecPtr ptr)
 	endptr = XLogCtl->xlblocks[idx];
 	if (expectedEndPtr != endptr)
 	{
+		XLogRecPtr	initializedUpto;
+
 		/*
-		 * Let others know that we're finished inserting the record up to the
-		 * page boundary.
+		 * Before calling AdvanceXLInsertBuffer(), which can block, let others
+		 * know how far we're finished with inserting the record.
+		 *
+		 * NB: If 'ptr' points to just after the page header, advertise a
+		 * position at the beginning of the page rather than 'ptr' itself. If
+		 * there are no other insertions running, someone might try to flush
+		 * up to our advertised location. If we advertised a position after
+		 * the page header, someone might try to flush the page header, even
+		 * though page might actually not be initialized yet. As the first
+		 * inserter on the page, we are effectively responsible for making
+		 * sure that it's initialized, before we let insertingAt to move past
+		 * the page header.
 		 */
-		WALInsertLockUpdateInsertingAt(expectedEndPtr - XLOG_BLCKSZ);
+		if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
+			ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
+			initializedUpto = ptr - SizeOfXLogShortPHD;
+		else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
+				 ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
+			initializedUpto = ptr - SizeOfXLogLongPHD;
+		else
+			initializedUpto = ptr;
+
+		WALInsertLockUpdateInsertingAt(initializedUpto);
 
 		AdvanceXLInsertBuffer(ptr, false);
 		endptr = XLogCtl->xlblocks[idx];