Hi All,
We have a setup where 2 JBoss (5.1) servers communicate with 1 instance
of PgPool (3.04), which in turn communicates with 2 PostgreSQL (8.4)
servers. The JBoss servers host some Java code for us, and as part of
that they run some quartz jobs.
These jobs are triggered right after startup, and some of them appear to
get stuck. When we inspect pg_locks in the database, we see a virtual
transaction that has all of its requested locks granted but does not
seem to make any further progress. When we inspect pg_stat_activity, it
seems that the process is still waiting for the query
(SELECT ... FOR UPDATE) to finish.
The locking transaction is described
here: http://pastebin.com/3pEn6vPe
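
For anyone who wants to reproduce the inspection, here is a minimal sketch
of how the two views could be joined from Java over JDBC. The class and
method names (LockInspector, snapshot) are made up for illustration and are
not part of our application. The column names procpid and current_query
are the PostgreSQL 8.4 ones; later releases renamed them. It is probably
best to run this on a direct connection to one backend rather than through
PgPool, so that it is clear which server the rows come from.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the application described in this post.
final class LockInspector {

    // Joins each lock entry with the session that holds or requests it.
    // Column names match PostgreSQL 8.4 (procpid, current_query, waiting).
    static final String LOCKS_SQL =
        "SELECT l.virtualtransaction, l.pid, l.locktype, l.mode, l.granted, "
      + "       a.waiting, a.xact_start, a.query_start, a.current_query "
      + "FROM pg_locks l "
      + "JOIN pg_stat_activity a ON a.procpid = l.pid "
      + "ORDER BY a.xact_start, l.pid";

    // Runs the query on the given connection and returns one line per lock.
    static List<String> snapshot(Connection conn) throws SQLException {
        List<String> rows = new ArrayList<String>();
        Statement st = conn.createStatement();
        try {
            ResultSet rs = st.executeQuery(LOCKS_SQL);
            while (rs.next()) {
                rows.add(rs.getString("virtualtransaction")
                    + " pid=" + rs.getInt("pid")
                    + " mode=" + rs.getString("mode")
                    + " granted=" + rs.getBoolean("granted")
                    + " waiting=" + rs.getBoolean("waiting")
                    + " xact_start=" + rs.getTimestamp("xact_start")
                    + " query=" + rs.getString("current_query"));
            }
        } finally {
            st.close(); // closing the statement also closes its result set
        }
        return rows;
    }
}
```

Taking a few snapshots some seconds apart and comparing xact_start and
query_start should show whether the same virtual transaction stays in
place with all locks granted.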
We know that the quartz thread is attempting to obtain a row share lock.
We know that we have enough connections available in postgres and in
pgpool. We also know that the issue occurs much more frequently when we
enable postgres statement logging. We assume that this is due to postgres
becoming slower as a result of the additional logging. When we look at
the server thread dump, we can see that all quartz threads are either
sleeping or waiting for postgres.
A thread dump of the relevant quartz
threads is described here: http://pastebin.com/iPhuFLrM
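
The same classification of the quartz threads could also be done from
inside the JVM. The following is only a sketch under assumptions: the
class name QuartzThreadReport is made up, the thread-name fragment passed
to report() has to match however the quartz worker threads are named in
the actual setup, and "waiting for postgres" is approximated by checking
whether any stack frame belongs to the org.postgresql JDBC driver package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the application described in this post.
final class QuartzThreadReport {

    // True if any frame of the stack belongs to the PostgreSQL JDBC driver.
    static boolean inPostgresDriver(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith("org.postgresql.")) {
                return true;
            }
        }
        return false;
    }

    // One line per live thread whose name contains nameFragment.
    static List<String> report(String nameFragment) {
        List<String> lines = new ArrayList<String>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFragment)) {
                continue;
            }
            lines.add(t.getName()
                + " state=" + t.getState()
                + " inPostgresDriver=" + inPostgresDriver(e.getValue()));
        }
        return lines;
    }
}
```

Calling QuartzThreadReport.report("Worker") periodically, assuming the
worker threads have "Worker" in their names, would show over time whether
a thread stays inside the driver between dumps.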
It is important to note that the issue does not occur only with quartz
jobs, but that is where we see it most frequently. This is probably
because the quartz jobs are where we have the highest level of
concurrency.
We suspect that a connection to the database acquires its locks but
somehow does not return to the application. If this is true, it would be
either a PostgreSQL or a PgPool problem. We would appreciate any help in
further debugging or resolving the situation.
Kind regards,
Fredrik