On Wed, Jun 7, 2017 at 9:16 AM, David Rosenstrauch <darose@xxxxxxxxxx> wrote:
> I'm running a Spark job that is writing to a postgres db (v9.6), using the
> JDBC driver (v42.0.0), and running into a puzzling error:
>
> 2017-06-06 16:05:17.718 UTC [36661] dmx@dmx ERROR: deadlock detected
> 2017-06-06 16:05:17.718 UTC [36661] dmx@dmx DETAIL: Process 36661 waits
> for ExclusiveLock on advisory lock [16649,0,102440,5]; blocked by process
> 36662.
>     Process 36662 waits for ExclusiveLock on advisory lock
> [16649,0,102439,5]; blocked by process 36661.
>
> However, I can't for the life of me figure out a) how the advisory locks
> are getting created (as I'm not calling for them myself), and b) how to
> fix this issue.
>
> A bit of background: My Spark job runs as multiple processes on multiple
> machines.  Each process performs its writes to pgsql using the JDBC
> driver.  The writes are performed a) as PG UPSERTs, b) as JDBC batches,
> and c) using JDBC prepared statements.  So each process, when it's time
> to write to the db, creates several prepared statements, adds a bunch of
> UPSERTs to each prepared statement (i.e., each prepared statement contains
> a batch of a few hundred UPSERTs), and then calls executeBatch() on each
> statement to perform the write.  That executeBatch() call is where I'm
> running into the error.
>
> In theory, since multiple processes are issuing these batched DB writes,
> there could be a record locking problem if, say, 2 processes tried to
> update the same user record.  But in reality this should be impossible.
> Spark partitions everything based on a key - in my case userID - so all
> DB writes for the same user should happen in the same process.  So at
> worst I could have a batch that contains multiple UPSERTs to the same
> user record, but I should never see updates to the same user from
> different processes.
>
> So, I'm very puzzled by that deadlock error.  Specifically:
>
> * How could it be possible that there are 2 PG processes trying to
> acquire the same lock?  Spark's partitioning should ensure that all
> updates to the same user record get routed to the same process, so this
> situation shouldn't even be possible.

That's really a question for the Spark team.  Obviously two processes are
contending for the same locks -- advisory locks are layered on top of the
basic locking mechanics, which are very well tested and proven.  What I can
tell you is that the core functions provided by postgres take no advisory
locks themselves -- that lock space is yours (that is, it is used only by
code under your control).

> * How/why am I winding up acquiring advisory locks in the first place?
> I'm never requesting them.  I looked at the PG JDBC driver code a bit,
> thinking that it might automatically be creating them for some reason,
> but that doesn't seem to be the case.  Maybe the PG database itself is?
> (E.g., does PG automatically use advisory locks with UPSERTs?)

Some code under your control is.  It could be an external module,
application code, or a stored procedure.

> And, last but not least:
>
> * How do I resolve this "waits for ExclusiveLock on advisory lock" issue?
> There's precious little info available regarding exactly what that error
> message is and how to solve it.
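The report itself is just the standard deadlock message with advisory locks
as the contested resource: each backend holds one advisory lock and is
waiting on the one the other backend holds.  A minimal sketch of that
pattern, with made-up key values and pg_advisory_xact_lock standing in for
whichever advisory lock function your code path actually ends up calling:

  -- session 1
  BEGIN;
  SELECT pg_advisory_xact_lock(102439);

  -- session 2
  BEGIN;
  SELECT pg_advisory_xact_lock(102440);

  -- session 1: blocks, waiting for the lock session 2 holds
  SELECT pg_advisory_xact_lock(102440);

  -- session 2: now each session waits on the other; the deadlock
  -- detector kills one of them with "deadlock detected"
  SELECT pg_advisory_xact_lock(102439);

So the real question is not what the message means, but what is taking
those locks in the first place.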
Barring some response from the Spark team, here is how I would narrow the
problem down:

*) Search the contents of pg_proc for functions calling advisory locks:

  SELECT * FROM pg_proc WHERE prosrc ~* 'advisory';

That might turn up hits in 3rd party code.

*) Turn on statement-level logging and, in bash:

  tail -f postgres_xx.log | grep -i advisory

*) Repeatedly query pg_locks for locktype = 'advisory':

  SELECT * FROM pg_locks WHERE locktype = 'advisory';

and also:

  SELECT * FROM pg_locks WHERE locktype = 'advisory' AND NOT granted;

Advisory locks are a very blunt instrument, and there is a significant risk
of two different locking schemes stepping on each other's toes.  I do not
recommend using them (especially the non-xact variants) unless you have
total control over all of the code that can take locks and a good
understanding of how any connection pooler sitting between the locking code
and the database interacts with them.

merlin
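If those pg_locks queries turn anything up, joining against pg_stat_activity
should show which backends are holding or waiting on advisory locks and what
they were running at the time; roughly, as an untested sketch against the
9.6 catalogs:

  -- which backends hold or wait on advisory locks, and what are they running
  SELECT l.pid,
         l.classid,
         l.objid,
         l.objsubid,
         l.granted,
         a.application_name,
         a.query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
   WHERE l.locktype = 'advisory';

That should at least identify which client (the Spark executors or something
else entirely) is issuing the advisory lock calls.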