On Sun, Sep 16, 2018 at 3:53 PM marcelo <marcelo.nicolet@xxxxxxxxx> wrote:
>
> I need a "logical locking" mechanism more flexible than the pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the database, with columns
> * tablename varchar - name of the "locked" table
> * rowid integer - id of the "locked" row
> * ownerid varchar - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after a certain time
>
> and a group of functions:
> 1) lock_table (tablename varchar, ownerid varchar) bool - take a lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask whether the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool - similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool - similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - release all locks owned by ownerid
>
> The timeout (by default, maybe 15 minutes) is applied implicitly when the lock is held by another user (there will be no notification).
> Redundant locks are not queued; they simply return true, possibly after an update of the acquired column.
> Successful locks insert a new row, except in the rare case of a timeout, which becomes an update (of ownerid and acquired).
> Unlock operations delete the corresponding row.
>
> My question is threefold:
> a) What is your opinion of the project?
> b) What are the consequences of the large number of inserts and deletions?
> c) Performance. In fact, pg_advisory* implies a network roundtrip, but (I think) no table operations.
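To restate the proposal concretely, the table you describe would look roughly like this (a sketch based only on the columns you list; the key and constraints are my assumption):

    CREATE TABLE lock_table (
        tablename varchar   NOT NULL,  -- name of the "locked" table
        rowid     integer   NOT NULL,  -- 0 = whole-table lock
        ownerid   varchar   NOT NULL,  -- who acquired the lock
        acquired  timestamp NOT NULL DEFAULT now(),  -- for expiring abandoned locks
        PRIMARY KEY (tablename, rowid)
    );
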
Why can't you use the advisory lock functions? The challenge with manually managed locks is that they are slow and you lose the coordination the database provides you. For example, if your application crashes you will have to clean up all held locks yourself. Building out that infrastructure will be difficult.
First, I think in an ideal world you wouldn't handle this problem with either approach, but sometimes you have to. I have actually done both. LedgerSMB uses its own lock table because locks have to persist across multiple HTTP requests, and we have various automatic cleanup processes. When I was working on the queue management stuff at Novozymes we used advisory locks extensively.
These two approaches have serious downsides:
1. Lock tables are *slow* and require careful thinking through cleanup scenarios. In LedgerSMB we tied locks to the application session, with an ON DELETE event that would unlock the rows (a sketch of that pattern follows this list). We estimated that for every 2 seconds the db spent doing useful work, it spent 42 seconds managing the locks... Additionally, the fact that locks take effect on snapshot advance is a problem here.
2. In my talk "PostgreSQL at 10TB and Beyond" I discuss a problem we had using advisory locks to manage rows that were being processed for deletion. Since the deletion scanned for items at the head of an index, under heavy load we could spend long enough checking dead rows that the locks could go away while our snapshot failed to advance. This would result in duplicate processing. So the fact that advisory locks don't really follow snapshot semantics is a really big problem here, since it means you can have race conditions with advisory locks that can't happen with other locking mechanisms. I still love advisory locks, but they are not a good tool for this.
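Regarding (1), the session-tied cleanup amounts to something like the following sketch. The table and column names are illustrative, not LedgerSMB's actual schema, and a trigger could be used instead of the cascading foreign key:

    -- application sessions; deleting a session (logout, timeout) releases its locks
    CREATE TABLE app_session (
        session_id serial PRIMARY KEY,
        last_seen  timestamptz NOT NULL DEFAULT now()
    );

    CREATE TABLE row_lock (
        locked_table text    NOT NULL,
        locked_id    integer NOT NULL,
        session_id   integer NOT NULL
                     REFERENCES app_session ON DELETE CASCADE,
        PRIMARY KEY (locked_table, locked_id)
    );
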
The real solution most of the time is actually to lock the rows by selecting them FOR UPDATE, possibly with SKIP LOCKED. The way update/delete row locking works in PostgreSQL is usually good enough, except in a few rare edge cases. Only in *very rare* cases do lock tables or advisory locks make sense for actual row processing.
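As a sketch of that pattern (the jobs table and its columns are hypothetical), a worker can claim a batch of rows without blocking on, or double-processing, rows another worker already holds:

    BEGIN;
    -- claim up to 10 unprocessed rows; rows locked by other workers are skipped
    UPDATE jobs
       SET processed = true
     WHERE id IN (SELECT id
                    FROM jobs
                   WHERE NOT processed
                   ORDER BY id
                   LIMIT 10
                     FOR UPDATE SKIP LOCKED)
    RETURNING id, payload;
    -- ... process the returned rows ...
    COMMIT;
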
merlin