Search Postgresql Archives

locked backend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

I've had today a strange lockup on our postgres data base.

Postgres 8.0.3
Debian GNU/Linux 3.1
autovacuum set up

The situation (diagnosed via pg_stat_activity): one table was locked by
an update, a VACUUM ANALYZE was running for the same table (triggered by
autovacuum), and a handful of inserts were waiting on the same table.
After some checking around (see below for details), I decided to kill
the UPDATE (it was holding up the inserts which means lost data for us
and danger of connection starvation), and did it with "kill processid"
from the command line, but no success: the backend didn't respond. Then
I killed the VACUUM which exited promptly, the UPDATE was still blocked.
Then I remembered there is a pg_cancel_backend function (seen it
recently on this list), and tried that too, without success - the update
was still blocked, and blocking all inserts.

In this situation I decided to kill -9 the blocked backend, knowingly
that this will basically restart the data base server, which did happen.
Only it took a veeeery long time to come up again (~15-20 minutes),
which I also find very strange...

Now my actual question is: what can cause a backend to block so that it
won't answer to kill or pg_cancel_backend ? Is there any other solution
than kill -9 in these situations ? I'm pretty sure the update would
never have finished, the box was mostly idle when I did all this, so I
can exclude that the update was busy working. I actually have seen this
a few times before, and had to do the same kill -9 to get back on track.

Some details of the circumstances this happened:

I'm attaching a file with some diagnostic output, together with the SQLs
used to get them. I also checked ps on the command line to make sure the
pg_stat_activity is not reporting phantom SQLs (as I had some occasions
it did). Unfortunately the attached diagnostic is not complete, as I
forgot to save it from the beginning, and some of it rolled out of my
terminal's buffer... particularly, the VACUUM is not visible anymore, I
already killed it by then.

As you see, the update was the oldest query running (and for more than 3
hours too !!!), and when I looked first there was a VACUUM also running,
being the immediate oldest entry after the UPDATE. The inserts came
after it, and checking their locks, they were waiting for a lock on the
"problem_table", I don't remember exactly what kind of lock.

I have no clue as of what's happening here, so I would be grateful for
any help in diagnosing this further...

Maybe I should mention that with the same data base (but a different box
!) I have already seen this behavior 1-2 times, and 2 times the system
locked up completely (the machine was not answering even for ssh, but
then after restart nothing was suspect, an extensive hardware test did
not find anything either).

TIA for any help.

Cheers,
Csaba.

    prepare ps as
    SELECT procpid, substring(current_query for 97), to_char((now()-query_start), 'HH24:MI:SS') as t
    FROM pg_stat_activity
    where current_query not like '%<insufficient%'
      and current_query not like '%IDLE%' order by t desc;

    prepare locks(bigint) as
     select c.relname, l.*
     from pg_locks l left outer join pg_class c on c.oid=l.relation
     where pid=$1
    union all
     select c.relname, l.*
     from pg_locks l left outer join pg_class c on c.oid=l.relation
     where l.pid = (select ml.pid from pg_locks ml, pg_locks cl
                    where cl.pid=$1
                      and not cl.granted
                      and cl.transaction = ml.transaction
                      and ml.mode = 'ExclusiveLock');


dbprod=# execute ps;
 procpid |        substring                      |    t
---------+---------------------------------------+----------
    5239 | UPDATE problem_table SET ... FROM ... | 03:40:54
    ... the VACUUM should be here...
    5248 | insert into problem_table( ...        | 03:37:00
    5251 | insert into problem_table( ...        | 03:26:07
    5235 | insert into problem_table( ...        | 03:25:20
    5331 | insert into problem_table( ...        | 03:24:58
    5266 | insert into problem_table( ...        | 03:24:58
    5262 | insert into problem_table( ...        | 03:24:58
    5271 | insert into problem_table( ...        | 03:24:57
    5282 | insert into problem_table( ...        | 03:24:48
    5328 | insert into problem_table( ...        | 03:24:47
    5304 | insert into problem_table( ...        | 03:24:41
    5354 | insert into problem_table( ...        | 03:24:39
    5243 | insert into problem_table( ...        | 03:24:24
    5358 | insert into problem_table( ...        | 03:23:46
    5260 | insert into problem_table( ...        | 03:21:57
    5273 | insert into problem_table( ...        | 03:21:52
    5256 | insert into problem_table( ...        | 03:21:39
    5496 | insert into problem_table( ...        | 03:21:23
    5238 | insert into problem_table( ...        | 03:21:17
    5321 | insert into problem_table( ...        | 03:19:58
    5287 | insert into problem_table( ...        | 03:19:48
    5320 | insert into problem_table( ...        | 03:19:47
    5348 | insert into problem_table( ...        | 03:19:38
    5359 | insert into problem_table( ...        | 03:18:41
    5301 | insert into problem_table( ...        | 03:18:32
    5498 | insert into problem_table( ...        | 03:16:23
    5319 | insert into problem_table( ...        | 03:14:58
    5284 | insert into problem_table( ...        | 03:14:48
    5322 | insert into problem_table( ...        | 03:14:47
    5349 | insert into problem_table( ...        | 03:14:38
    5362 | insert into problem_table( ...        | 03:13:41
    5286 | insert into problem_table( ...        | 03:13:32
    5492 | insert into problem_table( ...        | 03:11:18
    5323 | insert into problem_table( ...        | 03:09:58
    5285 | insert into problem_table( ...        | 03:09:48
    5337 | insert into problem_table( ...        | 03:09:47
    5356 | insert into problem_table( ...        | 03:08:41
    5303 | insert into problem_table( ...        | 03:08:32
    5488 | insert into problem_table( ...        | 03:06:18
    5333 | insert into problem_table( ...        | 03:04:57
    5293 | insert into problem_table( ...        | 03:04:48
    5339 | insert into problem_table( ...        | 03:04:47
    5361 | insert into problem_table( ...        | 03:03:41
    5305 | insert into problem_table( ...        | 03:03:32
    5491 | insert into problem_table( ...        | 03:01:18
    5316 | insert into problem_table( ...        | 02:59:57
    5297 | insert into problem_table( ...        | 02:59:48
    5330 | insert into problem_table( ...        | 02:59:47
    5357 | insert into problem_table( ...        | 02:58:41
    5300 | insert into problem_table( ...        | 02:58:32
    5310 | insert into problem_table( ...        | 02:56:42
    5489 | insert into problem_table( ...        | 02:56:18
    5315 | insert into problem_table( ...        | 02:54:57
    5289 | insert into problem_table( ...        | 02:54:48
    5332 | insert into problem_table( ...        | 02:54:47
    5365 | insert into problem_table( ...        | 02:53:41
    5307 | insert into problem_table( ...        | 02:53:32
    5497 | insert into problem_table( ...        | 02:51:18
    5329 | insert into problem_table( ...        | 02:49:57
    5326 | insert into problem_table( ...        | 02:49:47
    5291 | insert into problem_table( ...        | 02:49:47
    5364 | insert into problem_table( ...        | 02:48:41
    5288 | insert into problem_table( ...        | 02:48:32
    5487 | insert into problem_table( ...        | 02:46:18
    5318 | insert into problem_table( ...        | 02:44:57
    5336 | insert into problem_table( ...        | 02:44:47
    5280 | insert into problem_table( ...        | 02:44:47
    5367 | insert into problem_table( ...        | 02:43:41
    5294 | insert into problem_table( ...        | 02:43:32
    5495 | insert into problem_table( ...        | 02:41:18
    5327 | insert into problem_table( ...        | 02:39:57
    5338 | insert into problem_table( ...        | 02:39:47
    5295 | insert into problem_table( ...        | 02:39:47
    5360 | insert into problem_table( ...        | 02:38:41
    5290 | insert into problem_table( ...        | 02:38:32
    5247 | insert into problem_table( ...        | 02:37:00
    5493 | insert into problem_table( ...        | 02:36:18
    5325 | insert into problem_table( ...        | 02:34:57
    5334 | insert into problem_table( ...        | 02:34:47
    5281 | insert into problem_table( ...        | 02:34:47
    5366 | insert into problem_table( ...        | 02:33:41
    5296 | insert into problem_table( ...        | 02:33:32
   19725 | insert into problem_table( ...        | 02:31:18
    5317 | insert into problem_table( ...        | 02:29:57
   19756 | insert into problem_table( ...        | 02:29:47
   19739 | insert into problem_table( ...        | 02:29:47
   19829 | insert into problem_table( ...        | 02:28:41
    5292 | insert into problem_table( ...        | 02:28:32
   19888 | insert into problem_table( ...        | 02:26:18
    5236 | insert into problem_table( ...        | 02:25:19
   19738 | insert into problem_table( ...        | 02:24:57
    5272 | insert into problem_table( ...        | 02:24:57
    5267 | insert into problem_table( ...        | 02:24:57
   19853 | insert into problem_table( ...        | 02:24:47
    5283 | insert into problem_table( ...        | 02:24:47
   19967 | insert into problem_table( ...        | 02:23:41
   20000 | insert into problem_table( ...        | 02:23:31
    5275 | insert into problem_table( ...        | 02:21:52
    5257 | insert into problem_table( ...        | 02:21:39
   20004 | insert into problem_table( ...        | 02:21:17
   20121 | insert into problem_table( ...        | 02:19:57
   20005 | insert into problem_table( ...        | 02:19:47
   19966 | insert into problem_table( ...        | 02:19:47
   20023 | insert into problem_table( ...        | 02:18:41
   19887 | insert into problem_table( ...        | 02:18:31
   20127 | insert into problem_table( ...        | 02:16:17
   20274 | insert into problem_table( ...        | 02:14:57
   20272 | insert into problem_table( ...        | 02:14:47
   20068 | insert into problem_table( ...        | 02:14:47
    5363 | insert into problem_table( ...        | 02:13:41
   20173 | insert into problem_table( ...        | 02:13:31
   20003 | insert into problem_table( ...        | 02:11:17
   20273 | insert into problem_table( ...        | 02:09:57
   20124 | insert into problem_table( ...        | 02:09:47
   19863 | insert into problem_table( ...        | 02:09:47
   20308 | insert into problem_table( ...        | 02:08:41
   20395 | insert into problem_table( ...        | 02:08:31
   20493 | insert into problem_table( ...        | 02:06:17
   20002 | insert into problem_table( ...        | 02:04:57
   19999 | insert into problem_table( ...        | 02:04:47
   19862 | insert into problem_table( ...        | 02:04:47
   20022 | insert into problem_table( ...        | 02:03:41
   20316 | insert into problem_table( ...        | 02:03:31
   20541 | insert into problem_table( ...        | 02:01:17
   19852 | insert into problem_table( ...        | 01:59:57
   20555 | insert into problem_table( ...        | 01:59:47
   20275 | insert into problem_table( ...        | 01:59:47
   20637 | insert into problem_table( ...        | 01:58:41
   20662 | insert into problem_table( ...        | 01:58:31
    5311 | insert into problem_table( ...        | 01:56:42
   20686 | insert into problem_table( ...        | 01:56:17
   20748 | insert into problem_table( ...        | 01:54:57
   20704 | insert into problem_table( ...        | 01:54:47
   20673 | insert into problem_table( ...        | 01:54:47
   20520 | insert into problem_table( ...        | 01:53:41
   20796 | insert into problem_table( ...        | 01:53:31
   20823 | insert into problem_table( ...        | 01:51:17
   20926 | insert into problem_table( ...        | 01:49:57
   20810 | insert into problem_table( ...        | 01:49:47
   20672 | insert into problem_table( ...        | 01:49:47
   20774 | insert into problem_table( ...        | 01:48:41
   20948 | insert into problem_table( ...        | 01:48:31
   21041 | insert into problem_table( ...        | 01:46:17
   20945 | insert into problem_table( ...        | 01:44:57
   20974 | insert into problem_table( ...        | 01:44:47
    5324 | insert into problem_table( ...        | 01:44:47
   20868 | insert into problem_table( ...        | 01:43:41
   20824 | insert into problem_table( ...        | 01:43:31
   20809 | insert into problem_table( ...        | 01:41:17
   20944 | insert into problem_table( ...        | 01:39:57
   21113 | insert into problem_table( ...        | 01:39:47
   20943 | insert into problem_table( ...        | 01:39:46
   21180 | insert into problem_table( ...        | 01:38:41
   21032 | insert into problem_table( ...        | 01:38:31
   19671 | insert into problem_table( ...        | 01:37:00
   21334 | insert into problem_table( ...        | 01:36:17
   20942 | insert into problem_table( ...        | 01:34:57
   21355 | insert into problem_table( ...        | 01:34:47
   21098 | insert into problem_table( ...        | 01:34:46
   21182 | insert into problem_table( ...        | 01:33:40
   21031 | insert into problem_table( ...        | 01:33:31
   21362 | insert into problem_table( ...        | 01:31:17
   20542 | insert into problem_table( ...        | 01:29:57
   20825 | insert into problem_table( ...        | 01:29:47
   21361 | insert into problem_table( ...        | 01:29:46
   21181 | insert into problem_table( ...        | 01:28:40
    5298 | insert into problem_table( ...        | 01:28:31
   21479 | insert into problem_table( ...        | 01:26:17
   19988 | insert into problem_table( ...        | 01:25:19
   21473 | insert into problem_table( ...        | 01:24:57
   19990 | insert into problem_table( ...        | 01:24:57
   19989 | insert into problem_table( ...        | 01:24:57
   21474 | insert into problem_table( ...        | 01:24:47
   20927 | insert into problem_table( ...        | 01:24:46
   20973 | insert into problem_table( ...        | 01:23:40
   21605 | insert into problem_table( ...        | 01:23:31
   20069 | insert into problem_table( ...        | 01:21:52
   20096 | insert into problem_table( ...        | 01:21:39
   21607 | insert into problem_table( ...        | 01:21:17
   21606 | insert into problem_table( ...        | 01:19:57
   21740 | insert into problem_table( ...        | 01:19:47
   21683 | insert into problem_table( ...        | 01:19:46
   21716 | insert into problem_table( ...        | 01:18:40
   21505 | insert into problem_table( ...        | 01:18:31
   21757 | insert into problem_table( ...        | 01:16:17
   21615 | insert into problem_table( ...        | 01:14:57
   21744 | insert into problem_table( ...        | 01:14:47
   21584 | insert into problem_table( ...        | 01:14:46
   21756 | insert into problem_table( ...        | 01:13:40
   21755 | insert into problem_table( ...        | 01:13:31
    5494 | insert into problem_table( ...        | 01:11:17
   21616 | insert into problem_table( ...        | 01:09:57
   21356 | insert into problem_table( ...        | 01:09:47
   21745 | insert into problem_table( ...        | 01:09:46
   21506 | insert into problem_table( ...        | 01:08:40
   21972 | insert into problem_table( ...        | 01:08:31
   21824 | insert into problem_table( ...        | 01:06:17
   21638 | insert into problem_table( ...        | 01:04:56
   21743 | insert into problem_table( ...        | 01:04:47
   21974 | insert into problem_table( ...        | 01:04:46
   21973 | insert into problem_table( ...        | 01:03:40
   22013 | insert into problem_table( ...        | 01:03:31
   22105 | insert into problem_table( ...        | 01:01:17
   22012 | insert into problem_table( ...        | 00:59:56
   22138 | insert into problem_table( ...        | 00:59:47
   22137 | insert into problem_table( ...        | 00:59:46
   22016 | insert into problem_table( ...        | 00:58:40
   21719 | insert into problem_table( ...        | 00:58:31
   20775 | insert into problem_table( ...        | 00:56:42
   22084 | insert into problem_table( ...        | 00:56:17
   22270 | insert into problem_table( ...        | 00:54:56
   22273 | insert into problem_table( ...        | 00:54:47
   22269 | insert into problem_table( ...        | 00:54:46
   22209 | insert into problem_table( ...        | 00:53:40
   22375 | insert into problem_table( ...        | 00:53:31
   22230 | insert into problem_table( ...        | 00:51:17
   22409 | insert into problem_table( ...        | 00:49:56
   22376 | insert into problem_table( ...        | 00:49:47
   22009 | insert into problem_table( ...        | 00:49:46
   22408 | insert into problem_table( ...        | 00:48:40
   22134 | insert into problem_table( ...        | 00:48:31
   22549 | insert into problem_table( ...        | 00:46:17
   22240 | insert into problem_table( ...        | 00:44:56
   22001 | insert into problem_table( ...        | 00:44:47
   22282 | insert into problem_table( ...        | 00:44:46
   22413 | insert into problem_table( ...        | 00:43:40
   22516 | insert into problem_table( ...        | 00:43:31
   22676 | insert into problem_table( ...        | 00:41:17
   22517 | insert into problem_table( ...        | 00:39:56
   22546 | insert into problem_table( ...        | 00:39:47
   22646 | insert into problem_table( ...        | 00:39:46
   22785 | insert into problem_table( ...        | 00:38:40
   22412 | insert into problem_table( ...        | 00:38:31
   21300 | insert into problem_table( ...        | 00:37:00
   22493 | insert into problem_table( ...        | 00:36:17
   22827 | insert into problem_table( ...        | 00:34:56
   22788 | insert into problem_table( ...        | 00:34:47
   22564 | insert into problem_table( ...        | 00:34:46
   22279 | insert into problem_table( ...        | 00:33:40
   22820 | insert into problem_table( ...        | 00:33:31
   22722 | insert into problem_table( ...        | 00:31:17
   22642 | insert into problem_table( ...        | 00:29:56
   22673 | insert into problem_table( ...        | 00:29:47
   22921 | insert into problem_table( ...        | 00:29:46
   22983 | insert into problem_table( ...        | 00:28:40
   22952 | insert into problem_table( ...        | 00:28:31
   23047 | insert into problem_table( ...        | 00:26:17
   21608 | insert into problem_table( ...        | 00:25:19
   21610 | insert into problem_table( ...        | 00:24:57
   21609 | insert into problem_table( ...        | 00:24:57
   23046 | insert into problem_table( ...        | 00:24:56
   22953 | insert into problem_table( ...        | 00:24:47
   22645 | insert into problem_table( ...        | 00:24:46
   22819 | insert into problem_table( ...        | 00:23:40
   23089 | insert into problem_table( ...        | 00:23:31
   21684 | insert into problem_table( ...        | 00:21:52
   21713 | insert into problem_table( ...        | 00:21:39
   23102 | insert into problem_table( ...        | 00:21:17
   23214 | insert into problem_table( ...        | 00:19:56
   23076 | insert into problem_table( ...        | 00:19:47
   22968 | insert into problem_table( ...        | 00:19:46
   23177 | insert into problem_table( ...        | 00:18:40
   23209 | insert into problem_table( ...        | 00:18:30
   23303 | insert into problem_table( ...        | 00:16:17
   22950 | insert into problem_table( ...        | 00:14:56
   23207 | insert into problem_table( ...        | 00:14:47
   23344 | insert into problem_table( ...        | 00:14:46
   23224 | insert into problem_table( ...        | 00:13:40
   22682 | insert into problem_table( ...        | 00:13:30
   22951 | insert into problem_table( ...        | 00:11:17
   23339 | insert into problem_table( ...        | 00:09:56
   23462 | insert into problem_table( ...        | 00:09:47
   23225 | insert into problem_table( ...        | 00:09:46
   23360 | insert into problem_table( ...        | 00:08:40
   23208 | insert into problem_table( ...        | 00:08:30
   23393 | insert into problem_table( ...        | 00:06:17
   23599 | insert into problem_table( ...        | 00:04:56
   23343 | insert into problem_table( ...        | 00:04:47
   23349 | insert into problem_table( ...        | 00:04:46
   23359 | insert into problem_table( ...        | 00:03:40
   23596 | insert into problem_table( ...        | 00:03:30
   23503 | insert into problem_table( ...        | 00:01:17
(282 rows)
 



dbprod=# execute locks(5239);
     relname     | relation  | database  | transaction | pid  |       mode       | granted
-----------------+-----------+-----------+-------------+------+------------------+---------
                 |           |           |   141488647 | 5239 | ExclusiveLock    | t
 problem_table   | 701796052 | 188216764 |             | 5239 | AccessShareLock  | t
 problem_table   | 701796052 | 188216764 |             | 5239 | RowShareLock     | t
 problem_table   | 701796052 | 188216764 |             | 5239 | RowExclusiveLock | t
 pk_problem_table| 897539063 | 188216764 |             | 5239 | AccessShareLock  | t
 pk_problem_table| 897539063 | 188216764 |             | 5239 | RowExclusiveLock | t
 pk_other_table  | 897539027 | 188216764 |             | 5239 | AccessShareLock  | t
 other_table     | 701795713 | 188216764 |             | 5239 | AccessShareLock  | t
 third_table     | 701795842 | 188216764 |             | 5239 | AccessShareLock  | t
 problem_table   | 701796052 | 188216764 |             | 5239 | ExclusiveLock    | t
 pk_third_table  | 897539057 | 188216764 |             | 5239 | AccessShareLock  | t
(11 rows)
 

dbprod=# select pg_cancel_backend(5239);
 pg_cancel_backend
-------------------
                 1
(1 row)


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux