The count is only for a 10-second snapshot.. Plus remember there are
about 1000 users running so the connection being profiled only gets
0.01 of the period on CPU.. And the count is for that CONNECTION only.
Anyway using the lock wait script it shows the real picture as you
requested. Here the combined time means time "spent waiting" for the lock.
bash-3.00# echo 500 users
500 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Exclusive 1
CheckpointStartLock Shared 3
OidGenLock Shared 4
WALInsertLock Exclusive 7
FreeSpaceLock Exclusive 8
XidGenLock Exclusive 15
CheckpointStartLock Exclusive 16
Lock Id Combined Time (ns)
XidGenLock 3825000
OidGenLock 5307100
WALInsertLock 6317800
FreeSpaceLock 7244100
CheckpointStartLock 22199200
bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Exclusive 1
WALInsertLock Exclusive 1
CheckpointStartLock Shared 4
CheckpointStartLock Exclusive 11
XidGenLock Exclusive 21
Lock Id Combined Time (ns)
OidGenLock 1728200
WALInsertLock 2040700
XidGenLock 19878500
CheckpointStartLock 24156500
bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Shared 1
XidGenLock Shared 1
CheckpointStartLock Shared 2
WALInsertLock Exclusive 4
CheckpointStartLock Exclusive 6
FreeSpaceLock Exclusive 6
XidGenLock Exclusive 13
Lock Id Combined Time (ns)
OidGenLock 1730000
WALInsertLock 7253400
FreeSpaceLock 10977700
CheckpointStartLock 13356800
XidGenLock 38220500
bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
CheckpointStartLock Shared 1
WALInsertLock Exclusive 2
XidGenLock Shared 2
CheckpointStartLock Exclusive 5
FreeSpaceLock Exclusive 8
XidGenLock Exclusive 12
Lock Id Combined Time (ns)
WALInsertLock 3746800
FreeSpaceLock 7628900
CheckpointStartLock 11297500
XidGenLock 16649000
bash-3.00# echo 900 users - BIG DROP IN THROUGHPUT OCCURS...
900 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Exclusive 1
OidGenLock Shared 1
XidGenLock Shared 1
FreeSpaceLock Exclusive 2
WALInsertLock Exclusive 2
CheckpointStartLock Shared 6
XidGenLock Exclusive 12
CheckpointStartLock Exclusive 121
Lock Id Combined Time (ns)
OidGenLock 1968100
FreeSpaceLock 2076300
WALInsertLock 2190400
XidGenLock 20259400
CheckpointStartLock 1407294300
bash-3.00# echo 950 users
950 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Exclusive 1
OidGenLock Shared 2
CheckpointStartLock Shared 3
WALInsertLock Exclusive 4
FreeSpaceLock Exclusive 5
XidGenLock Exclusive 11
CheckpointStartLock Exclusive 50
Lock Id Combined Time (ns)
WALInsertLock 5577100
FreeSpaceLock 9115900
XidGenLock 13765800
OidGenLock 50155500
CheckpointStartLock 759872200
bash-3.00# echo 1000 users
1000 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
OidGenLock Shared 1
WALInsertLock Exclusive 1
XidGenLock Exclusive 5
CheckpointStartLock Shared 6
CheckpointStartLock Exclusive 102
Lock Id Combined Time (ns)
OidGenLock 21900
WALInsertLock 82500
XidGenLock 3313400
CheckpointStartLock 1448289900
bash-3.00# echo 1050 users
1050 users
bash-3.00# ./4_lwlock_waits.d 20764
Lock Id Mode Count
FreeSpaceLock Exclusive 1
CheckpointStartLock Shared 3
XidGenLock Exclusive 3
CheckpointStartLock Exclusive 146
Lock Id Combined Time (ns)
FreeSpaceLock 18400
XidGenLock 1900900
CheckpointStartLock 2392893700
bash-3.00#
-Jignesh
Tom Lane wrote:
"Jignesh K. Shah" <J.K.Shah@xxxxxxx> writes:
Here is how I got the numbers..
I had about 1600 users login into postgresql. Then started the run with
500 users and using DTrace I started tracking Postgresql Locking "as
viewed from one user/connection". Echo statements indicate how many
users were active at that point and how was throughput performing. All
IO is done on /tmp which means on a RAM disk.
bash-3.00# echo 500 users - baseline number
500 users
bash-3.00# ./3_lwlock_acquires.d 19178
I don't think I believe these numbers. For one thing, CheckpointLock
is simply not ever taken in shared mode. The ratios of counts for
different locks seems pretty improbable too, eg there is no way on
earth that the LockMgrLocks are taken more often shared than
exclusive (I would expect no shared acquires at all in the sort of
test you are running). Not to mention that the absolute number of
counts seems way too low. So I think the counting tool is broken.
Combined time of what exactly? It looks like this must be the total
duration the lock is held, at least assuming that the time for
CheckpointLock is correctly reported. It'd be much more useful to see
the total time spent waiting to acquire the lock.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
message can get through to the mailing list cleanly