Hello,
apologies in advance for the silly question.
We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar and might have a suggestion about what we are obviously missing?
In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles, which eventually hits the configured maximum. The handles do not get released until squid is restarted (-k restart).
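For reference, this is roughly how we watch the growth (just a sketch; it assumes squid runs as user "proxy", and the interval is arbitrary):

while true; do
    date '+%F %T'
    # descriptors squid holds that lsof only reports as "protocol: TCP/TCPv6"
    lsof -u proxy 2>/dev/null | grep -c 'protocol: TCP'
    # open fds per squid process
    for pid in $(pgrep -x squid); do echo "$pid $(ls /proc/$pid/fd | wc -l)"; done
    # kernel slab objects backing TCP sockets (name, active, total)
    grep -E '^(TCP|TCPv6) ' /proc/slabinfo | awk '{print $1, $2, $3}'
    sleep 300
done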
It is somewhat similar to what is reported under https://access.redhat.com/solutions/3362211 . They state that:

- If an application fails to close() its socket descriptors and continues to allocate new sockets, it can use up all the system memory on TCP(v6) slab objects.
- Note that some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object, but will not be accounted for in /proc/net/sockstat(6), "ss" or "netstat".
- It can be determined whether this is an application socket leak by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are then freed, the application is responsible, as that means the destructor routines have found open file descriptors to sockets in the process.

"This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."
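For what it's worth, the descriptors are only released when we restart squid, which seems to point in the same direction. The check from the article would be roughly this (a sketch, assuming a brief service stop is acceptable):

grep -E '^(TCP|TCPv6) ' /proc/slabinfo    # note the counts before
systemctl stop squid                      # or: squid -k shutdown
sleep 30
grep -E '^(TCP|TCPv6) ' /proc/slabinfo    # if squid held them, the counts drop back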
For example, on a server with squid 5.7, unmodified package:
list of open files:
lsof |wc -l
56963
of which 35K in TCPv6:
lsof |grep proxy |grep TCPv6 |wc -l
35301
under /proc I see fewer objects:
cat /proc/net/tcp6 |wc -l
3095
but the number of objects in the slab cache is high:
cat /proc/slabinfo |grep TCPv6
MPTCPv6 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCPv6 1155 1155 248 33 2 : tunables 0 0 0 : slabdata 35 35 0
request_sock_TCPv6 0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 38519 38519 2432 13 8 : tunables 0 0 0 : slabdata 2963 2963 0
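(For anyone reading along: the /proc/slabinfo columns, per the header line in that file, are
# name  <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> ...
so the TCPv6 line above means roughly 38.5K allocated TCPv6 socket objects of 2432 bytes each, on the order of 90 MB, essentially all of them active.)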
I have 35K lines like this:
lsof |grep proxy |grep TCPv6 |more
squid 1049 proxy 13u sock 0,8 0t0 5428173 protocol: TCPv6
squid 1049 proxy 14u sock 0,8 0t0 27941608 protocol: TCPv6
squid 1049 proxy 24u sock 0,8 0t0 45124047 protocol: TCPv6
squid 1049 proxy 25u sock 0,8 0t0 50689821 protocol: TCPv6
...
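One spot check that ties the two views together (just a sketch; the inode 5428173 is taken from the first lsof line above): look the socket inode up in the kernel's tcp6 table. Per the Red Hat note, a socket in TCP_CLOSE with an fd still open will not appear there.

grep -w 5428173 /proc/net/tcp6 || echo "not in the table: closed socket, fd still held"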
We thought maybe this was a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no v6 support. The leak just moved to TCP4:
lsof |wc -l
120313
cat /proc/slabinfo |grep TCP
MPTCPv6 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCPv6 0 0 248 33 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCPv6 0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 208 208 2432 13 8 : tunables 0 0 0 : slabdata 16 16 0
MPTCP 0 0 1856 17 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCP 5577 5577 248 33 2 : tunables 0 0 0 : slabdata 169 169 0
request_sock_TCP 1898 2002 304 26 2 : tunables 0 0 0 : slabdata 77 77 0
TCP 102452 113274 2240 14 8 : tunables 0 0 0 : slabdata 8091 8091 0
cat /proc/net/tcp |wc -l
255
lsof |grep proxy |wc -l
1221
Any suggestions? I guess it's something blatantly obvious, but we have been looking at this for a couple of days and we're not getting anywhere...
Thanks again