Kevin Brown wrote:
I've been creating a databased application with PostgreSQL for a while now and
have loved it, though we just recently stumbled on an interesting bug.
I have a particular dialog which can do all kinds of incredibly complicated
things to the database. Of course I didn't want to have to write the "undo
it to the database" code, nor did I want to force the user into being screwed
with a stray click.
So when the dialog opens, it executes a BEGIN, and then based on whether they
clicked ok or cancel on the dialog, it'd COMMIT or ROLLBACK. This worked
fine for me in testing, but now that I have two people using this app
simulatneously, if they both click on ok at the same time, postgres seemingly
deadlocks. My setup is such that I have n ruby clients talking with DRb to
another ruby process on the server, which then does all the communication to
postgres. Each user has his/her own database connection.
What ruby database library are you using?
Since ruby doesn't use native threads, but instead implements it's own threads (purely
in user-space), one thread can block your whole app - e.g, if the thread tries to read
from a socket where not data is available, and therefor the kernel puts the process
to sleep until data becomes available.
For the "normal" read() and write() operations exposed to the ruby programming this
is solved internally by ruby (it checks the filedescriptions state, and only reads
if the read won't block). But extensions, especially the ones talking over the network,
like the postgres client, can still cause this problem. In the "raw" postgres client
extension for ruby, you'll find two methods that can execute a query - query() and
query_async(). query() will block the _whole_ ruby interpreter until query results are
available, while query_async() will block only the thread that executed the query.
This one-thread-blocks-all behaviour can lead to "interesting" dead-locks. Assume,
for example, the following situation, with A and B being two ruby threads.
A1) update table where id=1
B1) update table where id=1
-----
B2) commit
A2) commit
Now, the query B1 will block until A2 is executed (because the record is already locked).
But if B1 blocks, and you used query() instead of query_async(), then the _whole_
ruby interpreter is blocked, and therefor A2 will never be executed. Bingo! you got
a deadlock, and postgres has no chance to ever detect this.
Of course, not using threads makes the problem even worse - the only solution is to
use the _async() functions. DBI lets you set a flag (forgot it's name) that forces
DBD::Pg to use the _async() functions - if you don't use DBI just replace all calls
to query() and exec() with query_async() and exec_async().
greetings, Florian Pflug