On 6/06/2013 8:31 p.m., Nuno Fernandes wrote:
Hello,
I have squid 3.3.5 that stopped responding. I can see that it is using 100%cpu.
Cache.log reports thousands of:
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
2013/06/06 09:19:09.000 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
2013/06/06 09:19:09.000 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
and then
2013/06/06 09:19:11.835 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:12.310 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:12.843 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:13.254 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:13.705 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:14.271 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:14.704 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
2013/06/06 09:19:15.032 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
# ls -la /proc/4857/fd|wc -l
770
Using 770 file descriptors
# netstat -anp|grep 4857|grep ESTAB|wc -l # established connections
682
The wierd thing is that if i don't activate ssl intercept i don't get this error. Another wierd thing is the established connections:
tcp 0 0 10.10.10.254:36046 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:36032 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35757 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35972 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:43033 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:3126 10.10.10.254:35965 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:35696 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:36011 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:42963 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:36091 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:35688 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:36098 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:36008 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:35758 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35944 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:42939 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35689 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:3126 10.10.10.254:35972 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35700 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 121 0 10.10.10.254:3126 10.10.10.254:42995 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:35770 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:3126 10.10.10.254:35958 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:3126 10.10.10.254:35976 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:3126 10.10.10.254:35941 ESTABLISHED 4857/(squid-1)
tcp 0 0 10.10.10.254:43051 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
10.10.10.254 is the squid box. 3126 is the ssl intercept port.
# grep 3126 /etc/sysconfig/iptables
[0:0] -A PREROUTING -i vlan10 -s 10.10.10.4 -p tcp -m tcp --dport 443 -j REDIRECT --to-port 3126
Only my ip address is forwarded to 3126... Here is the sslbump part of the conf.
https_port 3126 transparent ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=4MB cert=/etc/etproxy/ssl/myCA.pem
Funny story ...
*if* Squid were actually being "transparent proxy" here the outgoing
details on these connections would be "source 10.10.10.4:random-port,
destination some-IP:443". And your rule would loop that connection back
into Squid.
Unluckily for you "transparent" is currently an alias for "intercept"
and the Squid outgoing IP should not be 10.10.10.4. So the same
behaviour is being caused by something else more difficult to determin.
acl sslsniff src 10.10.10.4
acl sslbumpbypass dst "/etc/etproxy/whitelist.https"
acl broken_sites dstdomain .twitter.com
acl broken_sites dstdomain .facebook.com
always_direct allow sslsniff
ssl_bump none sslbumpbypass
ssl_bump none broken_sites
ssl_bump server-first all
sslcrtd_program /usr/lib/squid/ssl_crtd -s /etc/etproxy/ssl/ssl_db -M 4MB
sslcrtd_children 5
For starters check your configuration for the directive "via off" and
*remove* it. If it does not exist, please report that fact.
When that is done the broken requests should be rejected with a
forwarding loop error message and not DoS the machine while you are
testing for the source of the loop.
To track that down you can configure "debug_options ALL,1 11,2". Which
will dump the HTTP protocol headers and IP:port details for each HTTP
message through the proxy. Look for the ssl-bumped messages outbound
from Squid and see what IP:port details are on them and try to work back
from there how those details came to be.
Amos