The reads have many errors, about 25%, and your other email shows the error code EAGAIN.
All calls to connect() fail and all calls to recvfrom() fail.
It seems to me that the system has a resource problem,
such as running out of file descriptors or other system resources.
I suggest looking at /var/log/messages and running strace again to
find out why connect, recvfrom and accept fail.
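
One way to test the file-descriptor theory is to compare the number of descriptors a process has open with its soft limit. The following is only a minimal sketch, assuming a Linux /proc filesystem and enough privileges to read another process's entries; the function name `fd_usage` and the `proc_root` parameter are made up for illustration and are not part of Squid or strace.

```python
import os
import sys
from pathlib import Path


def fd_usage(pid, proc_root="/proc"):
    """Return (open_fds, soft_limit) for a process.

    soft_limit is None when the limit is "unlimited" or not listed.
    proc_root is a hypothetical parameter so the function can be
    pointed at a copy of /proc for testing.
    """
    proc_dir = Path(proc_root) / str(pid)
    # Each entry in /proc/<pid>/fd corresponds to one open descriptor.
    open_fds = len(os.listdir(proc_dir / "fd"))

    soft_limit = None
    with open(proc_dir / "limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft limit, hard limit, units.
                soft = line.split()[3]
                soft_limit = None if soft == "unlimited" else int(soft)
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    # Default to this script's own PID when none is given.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    used, limit = fd_usage(pid)
    print(f"pid {pid}: {used} descriptors open, soft limit {limit}")
```

Run with the squid PID as the only argument during a failure window; a count close to the limit would support the resource-exhaustion explanation, while a low count would point elsewhere.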
Marcus
karj wrote:
The output of strace at a normal time:
Process 18021 attached - interrupt to quit
^CProcess 18021 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 40.42    0.078685           1    131179        16 write
 23.38    0.045513           0    125618     32228 read
 11.07    0.021552           0     90238           epoll_ctl
  6.81    0.013266           0     64968           fcntl
  4.64    0.009038           1     12177           open
  3.95    0.007696           0     22260           close
  3.92    0.007640           0     24285           stat
  1.82    0.003541           0     19901           lseek
  1.82    0.003536           0     12021      2105 accept
  0.98    0.001913           0      4169           epoll_wait
  0.74    0.001440           0      9916           getsockname
  0.27    0.000533           0      8244           fstat
  0.14    0.000265           1       245       245 recvfrom
  0.02    0.000046           0       240       240 connect
  0.00    0.000000           0        12           brk
  0.00    0.000000           0       240           socket
  0.00    0.000000           0         3           sendto
  0.00    0.000000           0       240           bind
  0.00    0.000000           0       240           setsockopt
  0.00    0.000000           0       239           getsockopt
  0.00    0.000000           0         1           getrusage
  0.00    0.000000           0        15           getdents64
------ ----------- ----------- --------- --------- ----------------
100.00    0.194664                526451     34834 total
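
The per-syscall failure rates in a summary like this can be computed mechanically instead of by eye. The sketch below is illustrative only: it assumes the whitespace-separated layout of an `strace -c` summary as shown above, and the function names and the `SAMPLE` excerpt (rows copied from the table) are hypothetical helpers.

```python
def parse_strace_summary(text):
    """Map syscall name -> (calls, errors) from `strace -c` summary text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a numeric "% time" field; this skips the
        # header row and the dashed rules.
        if len(parts) < 5 or not parts[0].replace(".", "", 1).isdigit():
            continue
        name = parts[-1]
        if name == "total":
            continue
        calls = int(parts[3])
        # The errors column is blank when a syscall never failed.
        errors = int(parts[4]) if len(parts) == 6 else 0
        stats[name] = (calls, errors)
    return stats


def error_rates(stats):
    """Return (syscall, calls, errors, rate) rows, highest failure rate first."""
    rows = [(n, c, e, e / c) for n, (c, e) in stats.items() if c]
    return sorted(rows, key=lambda r: r[3], reverse=True)


# A few rows taken from the summary above.
SAMPLE = """\
 23.38    0.045513           0    125618     32228 read
  1.82    0.003536           0     12021      2105 accept
  0.14    0.000265           1       245       245 recvfrom
  0.02    0.000046           0       240       240 connect
  0.00    0.000000           0       240           socket
"""

if __name__ == "__main__":
    for name, calls, errors, rate in error_rates(parse_strace_summary(SAMPLE)):
        print(f"{name:10s} {errors:6d}/{calls:<7d} {rate:6.1%}")
```

Comparing these rates between a normal period and a failure period shows which syscalls actually change behaviour during the problem.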
Thanks again
-----Original Message-----
From: karj [mailto:gkaragiannidis@xxxxxxxxx]
Sent: Wednesday, 29 February 2012 3:40 PM
To: 'Sebastian Muniz'; squid-users@xxxxxxxxxxxxxxx
Subject: RE: squid "stops working" several times a day
I'm able to ping the machines.
The one thing that I observed is that
at the time of the crisis the squid process is using 100% of the CPU.
That's happening on every server that has the problem...
I've tried to use strace, but without success, since the strace output is
huge.
What else can I do to identify the problem?
At the time of the problem, cache.log suggests that squid loses connectivity
with almost everybody:
2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed
2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed
2012/02/29 09:15:51| Detected DEAD Parent: tityros_servers
2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed
2012/02/29 09:15:51| TCP connection to tityros_servers (xxx.xxx.xxx.xxx:80) failed
2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed
From another sibling's log at the same time:
2012/02/29 09:15:51| Detected DEAD Sibling: xxx.xxx.xxx.xxx
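
To see which peers fail and how often, the cache.log messages quoted here can be tallied per peer. This is a rough sketch, assuming each log entry sits on a single line in the real file (mail clients may wrap them) and that the message wording matches the excerpts above; the function name and the `SAMPLE_LOG` list are hypothetical.

```python
import re
from collections import Counter

# Patterns based only on the message wording quoted in this thread.
FAILED = re.compile(r"\| TCP connection to (\S+) \(.*?\) failed")
DEAD = re.compile(r"\| Detected DEAD (Parent|Sibling): (\S+)")


def summarize_peer_events(lines):
    """Count connection failures per peer and DEAD detections per (kind, peer)."""
    failures = Counter()
    dead = Counter()
    for line in lines:
        m = FAILED.search(line)
        if m:
            failures[m.group(1)] += 1
            continue
        m = DEAD.search(line)
        if m:
            dead[(m.group(1), m.group(2))] += 1
    return failures, dead


# Entries copied from the excerpts above, one per line.
SAMPLE_LOG = [
    "2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed",
    "2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed",
    "2012/02/29 09:15:51| Detected DEAD Parent: tityros_servers",
    "2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed",
    "2012/02/29 09:15:51| TCP connection to tityros_servers (xxx.xxx.xxx.xxx:80) failed",
    "2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed",
    "2012/02/29 09:15:51| Detected DEAD Sibling: xxx.xxx.xxx.xxx",
]

if __name__ == "__main__":
    failures, dead = summarize_peer_events(SAMPLE_LOG)
    for peer, count in failures.most_common():
        print(f"{peer}: {count} failed connections")
    for (kind, peer), count in dead.items():
        print(f"{kind} {peer}: marked DEAD {count} time(s)")
```

If every peer fails within the same second on every server, that points at the local machine rather than at any single backend.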
Thanks in advance
Yiannis
-----Original Message-----
From: Sebastian Muniz [mailto:basurerosebita@xxxxxxxxx]
Sent: Tuesday, 28 February 2012 11:52 PM
To: squid-users@xxxxxxxxxxxxxxx
Subject: Re: squid "stops working" several times a day
On 2/28/2012 2:54 PM, karj wrote:
Hi All,
I have a problem with my squid servers.
Squid "stops working" several times a day.
The only thing that warns me that something is wrong in cache.log is
the "Detected DEAD Sibling: xxx.xx.xxx.xxx" message.
After a few seconds everything goes back to normal.
We are using 5 squids (version 2.7.STABLE9) in accelerator mode, which
are siblings of each other.
So we have 5 sibling squids in front of our web farms, serving almost
7000 requests per second at peak time, and an average of 4500 requests
per second.
The problem occurs randomly on all servers...
Are you able to reach (telnet or ping or anything) the sibling during the
times that squid stops working?
What can you tell about the sibling logs? Especially the cache.log.
Regards
Sebastian