Tom and Kevin-
There were two entries in pg_prepared_xacts. In the test-bed, executing the 'ROLLBACK PREPARED' on both allowed the system to continue processing. All locks I saw in 'pg_locks' where the virtualtransaction started with the '-1/' were also gone. That was indeed the issue. More importantly to me, there was no issue likely leftover during our 9.1.4->9.1.6 upgrade we did; just a 'flaky transaction manager' as you suspected.
Thanks to both of you for help in tracking this down.
P.S. Kevin, We also do tar archives of the data directories nightly to accompany the wal files we store, using pg_start_backup/pg_stop_backup. :-) Full restores are tested monthly.
On Thu, Feb 21, 2013 at 2:06 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Ned Wolpert <ned.wolpert@xxxxxxxxxxxxx> writes:I'm betting one of those prepared transactions had updated or deleted
> Event: Running 9.1.6 with hot-standby, archiving 4 months of wal files,
> and even a nightly pg_dump all. 50G database. Trying to update or delete a
> row in a small (21 row, but heavily used table) would lock up completely.
> Never finish. Removed all clients, restarted the db instance, no joy. Check
> pg_stat_activity, and nothing that wasn't idle.... run the delete, locked
> up.
this row, and thus held a row lock on it. (Come to think of it, a
SELECT FOR UPDATE/SHARE might have been enough.) Did you try committing
or rolling back those xacts?
regards, tom lane
Virtually, Ned Wolpert
"Settle thy studies, Faustus, and begin..." --Marlowe