Re: Understanding updates and deadlocks more clearly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/2/22 16:15, Wells Oliver wrote:
Hey folks: I want to clarify a few things about locks on updates:

1- can two processes update the same table if they are updating different rows?

You can test this yourself by trying it,


2- in the event of two processes trying to update the same row, does this always result in a deadlock?

If txn 1 starts, updates TABLE_A then TABLE_B and commits, while "at the same time" txn 2 (say it's the same code but a different end user processing different data) does the same thing to the same or similar rows, you might have a lock conflict (depending on RDBMS, transaction type, type of indices, length of transaction, how well it's written, etc).

OTOH, you might get a deadlock if If txn 1 starts, updates TABLE_A then TABLE_B and commits, while "at the same time" txn 2 updates TABLE_B then TABLE_A.

In that case, txn 1 locks a chunk of TABLE_A, while txn 2 locks a chunk of TABLE_B, and then txn 1 tries to lock TABLE_B but can't, because txn 2 has already locked it.  "Simultaneously", txn 2 tries to lock TABLE_A but can't, because txn 1 has already locked it.

That's a simplistic form of deadlock, which for 25 years competent DBMSs have been able to avoid.  (Even so, you can sometimes force the issue by using SERIALIZABLE transaction mode.)


3- is there a way to create a system whereby potentially concurrent processes can update the same row in a given table, can the server be made to queue those updates, etc?

This is what DBMSs have been doing for 50+ years.  Of course a single CPU system can by definition only do one thing at a time; while multi-CPU systems can execute multiple CPU op-codes at once, at some point you'll get to a serialized bottleneck.  For example, the disk controller can only write one "thing" at at time to a disk block.

Thus, at some deep stage, everything is serialized.  Having said that, there are many layers between your application sending SQL code to Postgresql, and there hints that you can provide the RDBMS, and also there are hints that the RDBMS can figure out itself.

PostgreSQL uses MVCC (multi-version concurrency control, which keeps multiple versions of a record on disk; each transaction can only see one version, and at some point some of the record versions are marked "dead", and they just sit there wasting space and IO (since disks read blocks, not Postgresql records).  This is why VACUUM is so important in UPDATE/DELETE heavy tables.

Transaction length also plays into concurrency (or lack thereof).  A txn that starts, does some quick DB changes (having done as much work as possible beforehand) and then commits allows for faster concurrency than one that starts, does some work, does a bunch of unrelated tasks, inefficiently modifies the database, and then commits.  Or worse, starts a transaction, does some work, and then waits to commit until the user (who just went to lunch!!! presses a key.

--
Angular momentum makes the world go 'round.

[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux