The table has around 1.5M rows which have been updated/inserted around
121M times, the distribution of updates to row in alerts_alert will be
quite uneven, from 1 insert up to 1 insert and 0.5M updates.
Under high load (200-300 inserts/updates per second) we see occasional
(~10 per hour) updates taking excessively long times (2-10s). These
updates are always of the form:
UPDATE "alerts_alert" SET ...bunch of fields... WHERE
"alerts_alert"."id" = '...sha1 hash...';
Here's a sample explain:
https://explain.depesz.com/s/Fjq8
What could be causing this? What could we do to debug? What config
changes could we make to alleviate this?
Hello Chris,
One of the reasons could be the row already locked by another backend,
doing the same kind of an update or something different.
Are these updates performed in a longer transactions?
Can they hit the same row from two clients at the same time?
Is there any other write or select-for-update/share load on the table?
Have you tried periodical logging of the non-granted locks?
Try querying pg_stat_activity and pg_locks (possibly joined and maybe
repeatedly self-joined, google for it)
to get the backends that wait one for another while competing for to
lock the same row or object.
Best,
Alex