On 06/04/2014 10:55 AM, Tom Lane wrote:
> Ryan Johnson <ryan.johnson@xxxxxxxxxxxxxx> writes:
>> I get a strange behavior across repeated runs: each 100-second run is a
>> bit slower than the one preceding it, when run with SSI (SERIALIZABLE).
>> ... So the question: what should I look for to diagnose/triage this problem?
> In the past I've seen behaviors like this that traced to the range of
> "interesting" transaction IDs getting wider as time went on, so that
> more pages of pg_clog were hot, leading to more I/O traffic in the
> clog SLRU buffers. Possibly this is some effect like that.

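For scale, a quick back-of-envelope on that hypothesis (the buffer count
below is an assumption, since it varies by version and configuration):

    /* clog keeps 2 commit-status bits per xid, so one 8 kB page covers
     * 32768 xids.  With, say, 32 clog SLRU buffers, the whole cache spans
     * only ~1M xids; once the live xid range outgrows that, clog lookups
     * start missing the cache and doing I/O. */
    #define BLCKSZ               8192
    #define CLOG_BITS_PER_XACT   2
    #define CLOG_XACTS_PER_PAGE  ((BLCKSZ * 8) / CLOG_BITS_PER_XACT)   /* 32768 */
    #define NUM_CLOG_BUFFERS     32        /* assumed; version-dependent */
    #define CLOG_CACHE_XID_SPAN  (CLOG_XACTS_PER_PAGE * NUM_CLOG_BUFFERS) /* ~1.05M */
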
The effect disappears completely if I run under SI instead of SSI,
though. That makes me strongly suspect that the problem lurks in
SSI-specific infrastructure.

However, I did notice that the SLRU buffer that holds "old" SSI
transactions sometimes spikes from hundreds to millions of entries
(found by annotating the source to ereport a warning whenever the
difference between the buffer head and tail is at least 10% higher than
the previous record). Not sure whether that's related, though: I'm
pretty sure SSI never scans the SLRU; it's only used for random lookups.
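
The annotation was a trivial hack; a minimal sketch of the idea, with
the function and variable names invented here for illustration rather
than taken from predicate.c:

    /* Sketch: called wherever the OldSerXid SLRU head advances.  The names
     * (CheckOldSerXidSpan, oldSerXidMaxSpan) are made up; only the
     * 10%-over-previous-record rule is the point. */
    static uint32 oldSerXidMaxSpan = 0;

    static void
    CheckOldSerXidSpan(TransactionId headXid, TransactionId tailXid)
    {
        uint32 span = (uint32) (headXid - tailXid);  /* live entries */

        /* warn whenever the span beats the previous record by >= 10% */
        if (span > oldSerXidMaxSpan + oldSerXidMaxSpan / 10)
        {
            oldSerXidMaxSpan = span;
            ereport(WARNING,
                    (errmsg("OldSerXid head-tail span reached %u entries", span)));
        }
    }
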
>> I've tried linux perf, but all it says is that lots of time is going to
>> LWLock (but callgraph tracing doesn't work in my not-bleeding-edge
>> kernel).
> You could recompile with -DLWLOCK_STATS to enable gathering stats on
> which LWLocks are heavily contended. That would be a starting point
> for localizing the cause.

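For anyone following along: LWLOCK_STATS is a compile-time symbol tested
in lwlock.c, so enabling it is just a rebuild; each backend then dumps
its per-lock counters to stderr on exit.

    /* One way to enable it (an assumption about workflow, not the only
     * way): pass -DLWLOCK_STATS via COPT when running make, or
     * equivalently add this define to src/include/pg_config_manual.h
     * and rebuild. */
    #define LWLOCK_STATS
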
Here are the offenders (100-second run, 24 clients, ~2.2ktps):
lwlock 7 shacq 0 exacq 7002810 blk 896235 spindelay 213
lwlock 28 shacq 94984166 exacq 3938085 blk 572762 spindelay 163
lwlock 65 shacq 3347000 exacq 2933440 blk 255927 spindelay 90
lwlock 79 shacq 1742574 exacq 3009663 blk 216178 spindelay 41
lwlock 76 shacq 2293386 exacq 2892242 blk 205199 spindelay 70
lwlock 66 shacq 498909 exacq 2987485 blk 171508 spindelay 48
lwlock 80 shacq 512107 exacq 3181753 blk 165772 spindelay 43
lwlock 71 shacq 815733 exacq 3088157 blk 165579 spindelay 48
lwlock 74 shacq 603321 exacq 3065391 blk 159953 spindelay 56
lwlock 67 shacq 695465 exacq 2918970 blk 149339 spindelay 28
lwlock 69 shacq 411203 exacq 3044007 blk 148655 spindelay 34
lwlock 72 shacq 515260 exacq 2973321 blk 147533 spindelay 43
lwlock 30 shacq 41628636 exacq 8799 blk 143889 spindelay 186
lwlock 75 shacq 409472 exacq 2987227 blk 143196 spindelay 38
lwlock 77 shacq 409401 exacq 2946972 blk 139507 spindelay 34
lwlock 73 shacq 402544 exacq 2943467 blk 139380 spindelay 43
lwlock 78 shacq 404220 exacq 2912665 blk 137625 spindelay 21
lwlock 70 shacq 603643 exacq 2816730 blk 135851 spindelay 37
lwlock 68 shacq 403533 exacq 2862017 blk 131946 spindelay 30
lwlock 29 shacq 0 exacq 255302 blk 75838 spindelay 1
lwlock 0 shacq 0 exacq 561508 blk 12445 spindelay 3
lwlock 11 shacq 1245499 exacq 219717 blk 5501 spindelay 10
lwlock 4 shacq 381211 exacq 209146 blk 1273 spindelay 4
lwlock 3 shacq 16 exacq 209081 blk 522 spindelay 0
lwlock 8 shacq 0 exacq 137961 blk 50 spindelay 0
lwlock 2097366 shacq 0 exacq 384586 blk 1 spindelay 0
lwlock 2097365 shacq 0 exacq 370176 blk 1 spindelay 0
lwlock 2097359 shacq 0 exacq 363845 blk 1 spindelay 0
The above aggregates the per-lock stats from all processes, filters out
locks with fewer than 10000 accesses (shared+exclusive) or with zero
blk, then sorts by highest blk first.
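
That reduction step is mechanical; a minimal sketch of it in C, assuming
the raw per-process stderr lines have already been trimmed to the
"lwlock N shacq ..." form shown above (if memory serves, they also carry
a "PID nnn" prefix that would need stripping first):

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct
    {
        long id, shacq, exacq, blk, spin;
    } LockStat;

    /* sort descending by blk */
    static int
    by_blk_desc(const void *a, const void *b)
    {
        const LockStat *x = a, *y = b;
        return (y->blk > x->blk) - (y->blk < x->blk);
    }

    int
    main(void)
    {
        LockStat   *stats = NULL;
        size_t      n = 0, cap = 0;
        char        line[256];

        while (fgets(line, sizeof(line), stdin))
        {
            LockStat    s;
            size_t      i;

            if (sscanf(line, "lwlock %ld shacq %ld exacq %ld blk %ld spindelay %ld",
                       &s.id, &s.shacq, &s.exacq, &s.blk, &s.spin) != 5)
                continue;

            /* sum counters across processes for the same lock id */
            for (i = 0; i < n && stats[i].id != s.id; i++)
                ;
            if (i == n)
            {
                if (n == cap)
                    stats = realloc(stats, (cap = cap ? cap * 2 : 64) * sizeof(LockStat));
                stats[n++] = s;
            }
            else
            {
                stats[i].shacq += s.shacq;
                stats[i].exacq += s.exacq;
                stats[i].blk += s.blk;
                stats[i].spin += s.spin;
            }
        }

        qsort(stats, n, sizeof(LockStat), by_blk_desc);

        /* keep locks with >= 10000 total accesses and nonzero blk */
        for (size_t i = 0; i < n; i++)
            if (stats[i].shacq + stats[i].exacq >= 10000 && stats[i].blk > 0)
                printf("lwlock %ld shacq %ld exacq %ld blk %ld spindelay %ld\n",
                       stats[i].id, stats[i].shacq, stats[i].exacq,
                       stats[i].blk, stats[i].spin);

        free(stats);
        return 0;
    }
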
According to [1], locks {28, 29, 30} are {SerializableXactHashLock,
SerializableFinishedListLock, SerializablePredicateLockListLock}, all
SSI-related; locks 65-80 are the sixteen PredicateLockMgrLocks that the
post mentions. Looking in lwlock.h, lock 7 (which tops the list) is the
WALInsertLock. That lock was *not* mentioned in the pgsql-hackers post.
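
For reference, an abridged sketch of that fixed LWLockId enum from
9.3-era lwlock.h; the stats index is just the enum position. Entries not
named in this thread are quoted from memory, so treat them as assumptions:

    typedef enum LWLockId
    {
        BufFreelistLock,                        /*  0 (from memory) */
        /* ... */
        WALInsertLock = 7,                      /*  7: tops both lists */
        /* ... */
        SerializableXactHashLock = 28,          /* 28 */
        SerializableFinishedListLock,           /* 29 */
        SerializablePredicateLockListLock,      /* 30 */
        /* ... 65-80 are the 16 predicate-lock-manager partition slots;
         * ids like 209735x are dynamically assigned locks (presumably
         * per-buffer locks), not fixed entries in this enum */
    } LWLockId;
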
Re-running the same analysis for SI instead of SSI gives 4.6ktps and a
much shorter list:
lwlock 7 shacq 0 exacq 14050121 blk 3429384 spindelay 347
lwlock 11 shacq 3133994 exacq 450325 blk 23456 spindelay 29
lwlock 0 shacq 0 exacq 684775 blk 19158 spindelay 3
lwlock 4 shacq 780846 exacq 428771 blk 4539 spindelay 6
lwlock 3 shacq 19 exacq 428705 blk 1147 spindelay 0
lwlock 59 shacq 0 exacq 125943 blk 203 spindelay 0
lwlock 8 shacq 0 exacq 287629 blk 116 spindelay 0
lwlock 2097358 shacq 0 exacq 752361 blk 1 spindelay 0
lwlock 2097355 shacq 0 exacq 755086 blk 1 spindelay 0
lwlock 2097352 shacq 0 exacq 760232 blk 1 spindelay 0
However, all of this only confirms that SSI has a lock bottleneck; it
doesn't say why the bottleneck gets worse over time.
[1] http://www.postgresql.org/message-id/CA+TgmoYAiSM2jWEndReY5PL0sKbhgg7dbDH6r=oXKYzi9B7KJA@xxxxxxxxxxxxxx
Thoughts?
Ryan