On 1/25/07, Weslee Bilodeau <weslee.bilodeau@xxxxxxxxxxxxxxxxxxxxx> wrote:
Where I work I'm in charge of more then a few PostgreSQL databases. I understand why idle in transaction is bad, however I have some developers who I'm having a real difficult time fully explaining to them why its bad. Oh, and by bad I mean they have transactions that are sitting idle for 6+ hours at a time. Mainly because they don't speak very good English, and my words like MVCC and VACUUM have them tilting their heads wondering what language I'm speaking. I've tried searching the mailing lists for a good explanation, but haven't really found one thats easy to translate. They are Japanese, but I don't speak Japanese, so finding any resource in Japanese that explains it is beyond my ability. Would anyone happen to have a simple explanation, or a page online thats written in Japanese that I can pass off that might explain why this is bad? Is there a Wiki somewhere that says "101 ways to cause your DBA an aneurysm" that covers things like this? :)
Long running transactions (waiting on user input especially) are bad in practice and also in principle. The purpose of transactions is to accumulate multiple changes to a datastore so that you can never catch it in an invalid state. MVCC allows the database to do this while providing concurrency...the whole point of that is to keep as few locks for the shortest term possible. A major challenge in high activity databases is to keep information consistent and valid at all times while at the same time minimizing contention to high traffic objects. If you must keep long-term locks, check out advisory locks (8.2+): Keep in mind that while advisory locks avoid some of the headaches associated with long transactions they have concurrency issues at the same time. I wrote a little bit about them (shameless plug:) http://merlinmoncure.blogspot.com/2006/12/postgresql-8.html http://merlinmoncure.blogspot.com/2006/12/advisory-locks-part-2.html Most applications do not require long term (aka pessimistic) locks: they are a crutch. A more elegant solution is to provide an update mechanism such that the application alerts the user that the data has changed out from under them (meaning, the user's terminal has stale data) before sending it back to the server. This provides an opportunity to merge changes or pick one set of data over another. At the very least, it provides a way to minimize contention to when data is actually changed instead of broad high level locks by simply viewing data. merlin