Re: LISTEN / NOTIFY performance in 8.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 6:01 PM -0500 2/26/08, Tom Lane wrote:
Hmm, that says that it's not a matter of locking on pg_listener,
but of actually applying the row update(s) and/or signaling the
recipient(s).  If you're not seeing performance issues for ordinary
table-update operations it's hard to see why pg_listener updates would
be any worse, so that seems to point the finger at the signaling.
Which is just a matter of a kill(2) and shouldn't be that expensive.

It might be interesting to try strace'ing the whole PG process tree
while these notifies are going on, and seeing if you can identify
any specific kernel calls that seem to take a long time.

I ran PG via strace and then ran the test script at 25 consumers looping 25 times each. There were no other connections to the database for this strace run and test.

Digging through the strace file is a bit mind-numbing but here's some signs that semop and send are the culprits:

3 misc examples coming from near LISTEN or NOTIFIES:
- - - - - -
7495  18:10:40.251855 <... semop resumed> ) = 0 <1.006149>
7495 18:10:41.325442 <... semop resumed> ) = -1 EINTR (Interrupted system call) <0.299135>
7495  18:10:41.998219 <... semop resumed> ) = 0 <0.603566>

A chunk of log following the action on fd 7 (slow recv on ERESTARTSYS) and then the slow semop that follows:
- - - - - - - -
7495 18:10:42.576401 send(7, "C\0\0\0\vNOTIFY\0Z\0\0\0\5I", 18, 0 <unfinished ...>
7495  18:10:42.576503 <... send resumed> ) = 18 <0.000070>
7495  18:10:42.576620 recv(7,  <unfinished ...>
7495 18:10:42.873796 <... recv resumed> 0x8331d40, 8192, 0) = ? ERESTARTSYS (To be restarted) <0.297158>
7495  18:10:42.873911 --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
7495  18:10:42.874079 gettimeofday( <unfinished ...>
7495 18:10:42.874198 <... gettimeofday resumed> {1204078242, 874097}, NULL) = 0 <0.000101> 7495 18:10:42.874324 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={1, 0}}, <unfinished ...>
7495  18:10:42.874470 <... setitimer resumed> NULL) = 0 <0.000121>
7495  18:10:42.874604 semop(50495522, 0xbfff9764, 1 <unfinished ...>
7495  18:10:43.678431 <... semop resumed> ) = 0 <0.803809>

A little further on:
- - - - - - - -
7495 18:10:44.905320 <... semop resumed> ) = -1 EINTR (Interrupted system call) <0.998192>


I'm not sure what exactly that means, in terms of next steps. I'll dig more through the strace file and see if I can find anything else but those look to be definite bottlenecks for some reason.

At 2:24 PM -0800 2/26/08, Maurice Aubrey wrote:
What's the OS/Dist?

Red Hat Enterprise Linux ES release 3 (Taroon Update 8)

Thx,
-Joel

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux