Search Postgresql Archives

Re: random backend crashes - how to debug ( Is crash dump handler released ? )

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/14/2011 10:26 PM, BangarRaju Vadapalli wrote:
Hi Everybody,

We are using PostGRE 8.4 version and experiencing random backend
crashes. We have enabled logging and are able to see some logging
happening in pg_log directory but not of much use. Here are the logs.

- Examination of the full length logs sent off-list shows these lines leading up to the crash:

(crash1): 2011-06-15 13:55:59 IST postgres epimart ERROR: XX000: could not open relation base/2850136/3344343_vm: A blocking operation was interrupted by a call to WSACancelBlockingCall.^M

(crash2) 2011-06-15 14:22:40 IST postgres epimart ERROR: XX000: could not open relation base/2850136/3352537_fsm: A blocking operation was interrupted by a call to WSACancelBlockingCall.^M

... in both cases followed by:

XX000: cannot abort transaction 19859931, it was already committed

then a Windows runtime message reporting that the backend crashed.

Ideas?



After some off-list conversation, we've established that:

- The crash is part of a batch process where the OP loads data from external sources. It's the 11th stage of a 12 stage process, and cannot be easily separated into a small self-contained test case. If the OP runs just the 11th stage standalone, without having just run the prior stages, the crash does not occur. It only crashes if the whole process is run in one go.

(OP: please confirm that my summary is accurate, as it's condensed from several emails).

- The crash is reproducible on 9.0 . It hasn't yet been reproduced on 9.1 because the OP is having some problems with views on 9.1 that he'll be posting about separately.

- We can't seem to get a crash dump or attach a debugger to get a backtrace. I built a copy of the early version of the crash dump handler before it was integrated into 9.1 so he could load it as a DLL into 8.4, and it works when the backend is intentionally crashed but doesn't capture the crash that's causing the problem.

- I still haven't been able to confirm how the batch process in question works. Does it all run in a single connection with a single transaction? Or is it a multi-connection affair with multiple scripts / programs involved? If Bangar Raju could describe this part in more detail that would be very helpful.

--
Craig Ringer
2011-06-15 14:22:40 IST postgres epiop STATEMENT:  INSERT INTO logged_event(
		event_type,
		logged_application,
		execution_instance_key,
		event_level,
		thread_name,
		event_code,
		source_timestamp,
		database_time,
		num_params,
		param0,
		param1,
		param2,
		param3,
		param4,
		param5,
		param6,
		param7,
		param8,
		param9,
		overflow_bitmap,
		overflow_small,
		overflow_large
	) VALUES (
		$1, 
		'EpiChannel',
		576, 
		$2,
		$3,
		$4,
		$5,
		LOCALTIMESTAMP,
		$6,
		$7,
		$8,
		$9,
		$10,
		$11,
		$12,
		$13,
		$14,
		$15,
		$16,
		$17,
		$18,
		$19
	)
2011-06-15 14:22:40 IST postgres epiop DEBUG:  00000: CommitTransactionCommand
2011-06-15 14:22:40 IST postgres epiop LOCATION:  finish_xact_command, .\src\backend\tcop\postgres.c:2361
2011-06-15 14:22:40 IST postgres epiop DEBUG:  00000: CommitTransaction
2011-06-15 14:22:40 IST postgres epiop LOCATION:  ShowTransactionState, .\src\backend\access\transam\xact.c:4074
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 14:22:40 IST postgres epiop DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 19860485/1/4, nestlvl: 1, children: 
2011-06-15 14:22:40 IST postgres epiop LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: CommitTransaction
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionState, .\src\backend\access\transam\xact.c:4074
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: CommitTransaction
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionState, .\src\backend\access\transam\xact.c:4074
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: ProcessCatchupEvent outside transaction
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ProcessCatchupEvent, .\src\backend\storage\ipc\sinval.c:327
2011-06-15 14:22:40 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 14:22:40 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 14:22:40 IST postgres epimart ERROR:  XX000: could not open relation base/2850136/3352537_fsm: A blocking operation was interrupted by a call to WSACancelBlockingCall.

	
2011-06-15 14:22:40 IST postgres epimart LOCATION:  mdopen, .\src\backend\storage\smgr\md.c:526
2011-06-15 14:22:40 IST postgres epimart STATEMENT:  COMMIT
2011-06-15 14:22:40 IST postgres epimart WARNING:  01000: AbortTransaction while in COMMIT state
2011-06-15 14:22:40 IST postgres epimart LOCATION:  AbortTransaction, .\src\backend\access\transam\xact.c:2011
2011-06-15 14:22:40 IST postgres epimart PANIC:  XX000: cannot abort transaction 19859931, it was already committed
2011-06-15 14:22:40 IST postgres epimart LOCATION:  RecordTransactionAbort, .\src\backend\access\transam\xact.c:1200


This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

2011-06-15 13:55:59 IST postgres epiop LOG:  00000: duration: 0.000 ms  execute <unnamed>: INSERT INTO logged_event(
		event_type,
		logged_application,
		execution_instance_key,
		event_level,
		thread_name,
		event_code,
		source_timestamp,
		database_time,
		num_params,
		param0,
		param1,
		param2,
		param3,
		param4,
		param5,
		param6,
		param7,
		param8,
		param9,
		overflow_bitmap,
		overflow_small,
		overflow_large
	) VALUES (
		$1, 
		'EpiChannel',
		543, 
		$2,
		$3,
		$4,
		$5,
		LOCALTIMESTAMP,
		$6,
		$7,
		$8,
		$9,
		$10,
		$11,
		$12,
		$13,
		$14,
		$15,
		$16,
		$17,
		$18,
		$19
	)
2011-06-15 13:55:59 IST postgres epiop DETAIL:  parameters: $1 = 'INFO', $2 = '4', $3 = 'Job Execution Thread', $4 = 'SQL.ExplainOutput.NOWRAP', $5 = '1308126359286', $6 = '10', $7 = 'inhydbvadapall1', $8 = 'epimart', $9 = NULL, $10 = 'N/A', $11 = 'N/A', $12 = 'N/A', $13 = 'N/A', $14 = 'N/A', $15 = 'N/A', $16 = '@ 6/15/11 1:55:59 PM', $17 = '4', $18 = 'BEGIN EXPLAIN PLAN
	|QUERY PLAN
	|Result  (cost=0.00..0.26 rows=1 width=0)
	|  Output: epi_createindex(''CREATE  UNIQUE   INDEX  CustomerFMap_A_PK  ON  CustomerFMap_A  (  child_real_key  )    ''::character varying, '' CustomerFMap_A_PK ''::character varying)
	END EXPLAIN PLAN', $19 = NULL
2011-06-15 13:55:59 IST postgres epiop LOCATION:  exec_execute_message, .\src\backend\tcop\postgres.c:1988
2011-06-15 13:55:59 IST postgres epimart DEBUG:  00000: sinval catchup complete, cleaning queue
2011-06-15 13:55:59 IST postgres epimart LOCATION:  ReceiveSharedInvalidMessages, .\src\backend\storage\ipc\sinval.c:141
2011-06-15 13:55:59 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 13:55:59 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 13:55:59 IST postgres epimeta DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 13:55:59 IST postgres epimeta LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 13:55:59 IST postgres epimart ERROR:  XX000: could not open relation base/2850136/3344343_vm: A blocking operation was interrupted by a call to WSACancelBlockingCall.

	
2011-06-15 13:55:59 IST postgres epimart LOCATION:  mdopen, .\src\backend\storage\smgr\md.c:526
2011-06-15 13:55:59 IST postgres epimart STATEMENT:  COMMIT
2011-06-15 13:55:59 IST postgres epimeta DEBUG:  00000: CommitTransaction
2011-06-15 13:55:59 IST postgres epimeta LOCATION:  ShowTransactionState, .\src\backend\access\transam\xact.c:4074
2011-06-15 13:55:59 IST postgres epimart WARNING:  01000: AbortTransaction while in COMMIT state
2011-06-15 13:55:59 IST postgres epimart LOCATION:  AbortTransaction, .\src\backend\access\transam\xact.c:2011
2011-06-15 13:55:59 IST postgres epimart DEBUG:  00000: StartTransaction
2011-06-15 13:55:59 IST postgres epimart LOCATION:  ShowTransactionState, .\src\backend\access\transam\xact.c:4074
2011-06-15 13:55:59 IST postgres epimart PANIC:  XX000: cannot abort transaction 19773146, it was already committed
2011-06-15 13:55:59 IST postgres epimart LOCATION:  RecordTransactionAbort, .\src\backend\access\transam\xact.c:1200
2011-06-15 13:55:59 IST postgres epimart DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children: 
2011-06-15 13:55:59 IST postgres epimart LOCATION:  ShowTransactionStateRec, .\src\backend\access\transam\xact.c:4112
2011-06-15 13:55:59 IST postgres epiop DEBUG:  00000: sinval catchup complete, cleaning queue
2011-06-15 13:55:59 IST postgres epiop LOCATION:  ReceiveSharedInvalidMessages, .\src\backend\storage\ipc\sinval.c:141


This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux