Re: Hung thread


 



Yes, each hung thread shows the exact same backtrace, and each was spawned by a request to the same CGI script but with differing arguments.
There is an LDAP login requirement for this tool. I'm not sure whether that is significant, as many other tools on this same server require LDAP authentication as well.


MJ



On Thursday, June 18, 2015 7:56 AM, Jeff Trawick <trawick@xxxxxxxxx> wrote:


On Wed, Jun 17, 2015 at 8:51 PM, Mark Jacquet <mark_jacquet@xxxxxxxxx.invalid> wrote:
Just another oddity to add to the issue.

Overnight, several more hung threads appeared and the load on the system jumped into the mid-20s.
After killing these, the load did not drop. Looking at the list of running processes, I found httpd processes, spawned from the original root httpd process, that *were not even displayed* in the scoreboard! After killing off these hidden zombies, the load dropped again.

What's common about the processes?  Similar backtrace to the first one posted?


 

So now I have to catch and kill two types: Zombies on the scoreboard and hidden zombies.
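
A minimal sketch of how the "hidden" kind could be spotted, assuming the output of `ps -e -o pid,ppid,comm` is available as text and that the PIDs shown on the scoreboard have already been collected by some other means. The function names, the parent PID, and the sample data are hypothetical and only illustrate the comparison; this is not something taken from httpd itself.

```python
def parse_ps(ps_output):
    """Parse `ps -e -o pid,ppid,comm` text into (pid, ppid, comm) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header line, blank lines, and anything malformed.
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def hidden_children(ps_rows, parent_pid, scoreboard_pids, name="httpd"):
    """Return httpd children of parent_pid that the scoreboard does not list."""
    return sorted(
        pid
        for pid, ppid, comm in ps_rows
        if ppid == parent_pid
        and comm.endswith(name)
        and pid not in scoreboard_pids
    )


if __name__ == "__main__":
    # Hypothetical sample data: parent httpd is PID 1000.
    sample = (
        "  PID  PPID COMMAND\n"
        " 1000     1 /usr/local/apache2/bin/httpd\n"
        "16631  1000 /usr/local/apache2/bin/httpd\n"
        "20001  1000 /usr/local/apache2/bin/httpd\n"
    )
    print(hidden_children(parse_ps(sample), 1000, {16631}))
```

Any PID this reports would still need to be checked by hand (for example with a backtrace) before killing it, since a child that has only just been forked could plausibly be missing from a scoreboard snapshot taken a moment earlier.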

And this is cute: sometimes the zombies hang around so long that, when the system gets back to creating a new process for slot #1, a zombie that was originally in that slot is displayed there along with its brothers from the new process:


"scoreboard squatting"


For example, note process 19597 below:

1-0166310/33/1320_ 131.22202255280.01.6035.79 10.172.91.217newyahoo.oak.sap.corp:80NULL
1-0166310/18/1087_ 105.88340736980.00.6926.65 10.172.240.113www-dse.oak.sap.corp:80GET /cgi-bin/websql/websql.dir/QTS/bugsheetcont.hts?bugid=74133
1-0166310/11/1178_ 76.49589542980.00.5634.78 10.172.91.92newyahoo.oak.sap.corp:80NULL
1-0166310/32/1295_ 92.17425417130.04.0342.07 10.172.240.113newyahoo.oak.sap.corp:80NULL
1-0195970/26/1319W 35.552441700.00.5437.10 10.172.248.87www-rev.oak.sap.corp:80GET /cgi-bin/rev.cgi?action="" HTTP/1.1
1-0166310/12/1427_ 18.41794100.00.14238.52 10.172.240.113newyahoo.oak.sap.corp:80NULL
1-0166310/27/1442_ 30.67719695430.00.7835.07 10.172.85.9newyahoo.oak.sap.corp:80NULL
1-0166310/19/784_ 10.70940630.00.4520.95 10.172.246.203newyahoo.oak.sap.corp:80NULL
1-0166310/8/1034_ 2.86103144630.00.0124.04 10.172.90.155newyahoo.oak.sap.corp:80NULL
2-0-0/0/99. 58.943145013820.00.002.15 10.136.66.135newyahoo.oak.sap.corp:80NULL
2-0-0/0/82. 2181.923144824390.00.001.48 10.162.65.165www-dse.oak.sap.corp:80POST /cgi-bin/websql/websql.dir/QTS/bugsescalated.pl?product=AN
2-0-0/0/162. 2027.12314509350.00.003.36 10.50.3.99newyahoo.oak.sap.corp:80NULL
2-0-0/0/576. 1704.40314504100.00.0013.38 10.172.240.113newyahoo.oak.sap.corp:80NULL
2-0-0/0/928. 1295.363145029750.00.0024.38 10.50.17.221newyahoo.oak.sap.corp:80NULL
2-0-0/0/852. 1798.52314503810.00.0020.72 10.162.65.165newyahoo.oak.sap.corp:80NULL
2-0-0/0/1084. 551.293145022210.00.0026.52 10.176.138.162newyahoo.oak.sap.corp:80NULL
2-0-0/0/1180. 385.833145019630.00.0034.31 10.162.65.197newyahoo.oak.sap.corp:80NULL
2-0-0/0/50. 50.713145000.00.001.62 10.58.181.166www-rev.oak.sap.corp:80GET /cgi-bin/rev.cgi?action="" HTTP/1.1
2-0137610/12/1078W 58.803489600.00.1031.67 10.172.107.38www-rev.oak.sap.corp:80POST /cgi-bin/rev.cgi HTTP/1.1
2-0-0/0/1075. 1061.5331450790.00.0031.65 10.172.90.88newyahoo.oak.sap.corp:80GET /server-status HTTP/1.1
2-0-0/0/1362. 46.803145080.00.0039.72 10.172.107.38www-rev.oak.sap.corp:80POST /cgi-bin/rev.cgi HTTP/1.1
2-0-0/0/1142. 56.693145011490.00.0035.22 10.172.240.113newyahoo.oak.sap.corp:80NUL
Slot #2 currently not being used (still has zombie)

MJ








On Tuesday, June 16, 2015 5:42 PM, Mark Jacquet <mark_jacquet@xxxxxxxxx.INVALID> wrote:


Upgrade as in an Apache upgrade or a Solaris 5.10 patch upgrade? :)

Apache is all new, of course: 2.4.12 built with the latest add-on sources (apr, pcre, etc.).
The bad news is the OS is not at all up to date. And for reasons I have no control over, I cannot patch.
So if this is an OS issue then ......

I seem to be running with the Sun native LDAP SDK. Would building against a different LDAP library (OpenLDAP) help?

Long term plan -> moving all Apache servers to Linux

Mj



On Tuesday, June 16, 2015 5:31 PM, Eric Covener <covener@xxxxxxxxx> wrote:


On Tue, Jun 16, 2015 at 8:23 PM, Mark Jacquet

<mark_jacquet@xxxxxxxxx.invalid> wrote:
> So do you think this hang is related to the native LDAP lib code?


It is possible but IMO not very likely. It has to corrupt memory just
enough to put a looping structure in apr_rmm.  What's your upgrade
history like?

--
Eric Covener
covener@xxxxxxxxx










--
Born in Roswell... married an alien...
http://emptyhammock.com/



