Thanks for your reply.
On Friday, July 12, 2024, 7:05 PM, Yvain PAYEN <yvain.payen@xxxxxxxx> wrote:
Hi,
In my setup (also Ubuntu) I have made these changes:
root@proxy: # cat /etc/security/limits.d/squid.conf
squid soft nofile 64000
squid hard nofile 65500
root@proxy: # cat /etc/squid/squid.conf | grep max_file
max_filedesc 64000
This forces the system limits for the squid process and tells squid how many FDs it can consume.
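One way to confirm that these limits actually reached the running process is to read its /proc/<pid>/limits entry. Whether limits.d is honoured can depend on how squid is started, since a service manager may apply its own limit. A minimal sketch, where the helper name max_open_files is only illustrative:

```shell
# Print the soft and hard "Max open files" limits from a /proc/<pid>/limits file.
# max_open_files is a hypothetical helper name, not part of squid.
max_open_files() {
    # Columns of that line: "Max open files" <soft> <hard> <units>
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Example (assumes pgrep is available; -o picks the oldest squid process):
# max_open_files "/proc/$(pgrep -o squid)/limits"
```

If the two numbers printed do not match the values configured above, the limit is being set somewhere else.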
Regards,
Yvain PAYEN
From: squid-users <squid-users-bounces@xxxxxxxxxxxxxxxxxxxxx> On behalf of paolo.prinx@xxxxxxxxx
Sent: Friday, July 12, 2024 12:58
To: squid-users@xxxxxxxxxxxxxxxxxxxxx
Subject: Socket handle leak?
Hello,
apologies in advance for the silly question.
We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar and might have suggestions about what we are obviously missing.
In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum. The handles do not get released until squid is restarted (-k restart).
It is somewhat similar to what is reported at https://access.redhat.com/solutions/3362211 . They state that:
- If an application fails to close() it's socket descriptors and continues to allocate new sockets then it can use up all the system memory on TCP(v6) slab objects.
- Note some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object. But will not be accounted for in /proc/net/sockstat(6) or "ss" or "netstat".
- It can be determined whether this is an application sockets leak, by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are freed then the application is responsible. As that means that destructor routines have found open file descriptors to sockets in the process.
"This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."
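The second point in the quoted text can be checked directly: sockets that still hold a file descriptor appear under /proc/<pid>/fd even when "ss" or "netstat" no longer list them. A rough sketch, assuming a Linux /proc layout; count_socket_fds is an illustrative name, not an existing tool:

```shell
# Count the entries in a /proc/<pid>/fd directory that point at sockets.
# count_socket_fds is a hypothetical helper name.
count_socket_fds() {
    n=0
    for fd in "$1"/*; do
        # Socket descriptors are symlinks of the form "socket:[inode]"
        case "$(readlink "$fd")" in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Example (run as root or as the user owning the process; <pid> is the squid PID):
# count_socket_fds /proc/<pid>/fd
```

A count far above the number of connections listed in /proc/net/tcp and /proc/net/tcp6 would be consistent with the leak described in the article.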
For example, on a server with squid 5.7, unmodified package:
List of open files:
lsof |wc -l
56963
of which 35K in TCPv6:
lsof |grep proxy |grep TCPv6 |wc -l
35301
Under /proc I see fewer objects:
cat /proc/net/tcp6 |wc -l
3095
but the number of objects in the slabs is high
cat /proc/slabinfo |grep TCPv6
MPTCPv6 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCPv6 1155 1155 248 33 2 : tunables 0 0 0 : slabdata 35 35 0
request_sock_TCPv6 0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 38519 38519 2432 13 8 : tunables 0 0 0 : slabdata 2963 2963 0
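To put the slab figures in perspective, the active object count can be multiplied by the object size (columns 2 and 4 of /proc/slabinfo) to estimate the memory held by a cache. A small sketch, assuming the column layout shown above; slab_usage is an illustrative name:

```shell
# Print "<cache> <active_objs> <approx_bytes>" for one cache.
# slabinfo-formatted text is read from stdin; slab_usage is a hypothetical helper name.
slab_usage() {
    # $1 = exact cache name, e.g. TCPv6 (does not match tw_sock_TCPv6)
    awk -v name="$1" '$1 == name { printf "%s %d %.0f\n", $1, $2, $2 * $4 }'
}

# Example (reading /proc/slabinfo normally requires root):
# slab_usage TCPv6 < /proc/slabinfo
```

For the TCPv6 line above this works out to 38519 objects of 2432 bytes each, or roughly 93.7 MB.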
I have 35K lines like this:
lsof |grep proxy |grep TCPv6 |more
squid 1049 proxy 13u sock 0,8 0t0 5428173 protocol: TCPv6
squid 1049 proxy 14u sock 0,8 0t0 27941608 protocol: TCPv6
squid 1049 proxy 24u sock 0,8 0t0 45124047 protocol: TCPv6
squid 1049 proxy 25u sock 0,8 0t0 50689821 protocol: TCPv6
...
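Entries of type "sock" whose name is only "protocol: TCPv6" are typically sockets for which lsof could not find endpoint details, which fits the pattern of descriptors that are held open but no longer connected. A sketch for tallying them per protocol from lsof output, assuming the column layout shown above; count_bare_socks is an illustrative name:

```shell
# Count lsof lines of TYPE "sock", grouped by the protocol in the last column.
# lsof output is read from stdin; count_bare_socks is a hypothetical helper name.
count_bare_socks() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "sock") { count[$NF]++; break }
    }
    END { for (p in count) print p, count[p] }' | sort
}

# Example, using the squid PID from the listing above:
# lsof -p 1049 | count_bare_socks
```

Running this periodically and comparing the totals would show how fast the count grows.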
We thought this might be a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no IPv6 support. The problem simply moved to TCP over IPv4:
lsof |wc -l
120313
cat /proc/slabinfo |grep TCP
MPTCPv6 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCPv6 0 0 248 33 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCPv6 0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 208 208 2432 13 8 : tunables 0 0 0 : slabdata 16 16 0
MPTCP 0 0 1856 17 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCP 5577 5577 248 33 2 : tunables 0 0 0 : slabdata 169 169 0
request_sock_TCP 1898 2002 304 26 2 : tunables 0 0 0 : slabdata 77 77 0
TCP 102452 113274 2240 14 8 : tunables 0 0 0 : slabdata 8091 8091 0
cat /proc/net/tcp |wc -l
255
After restarting squid the slab objects are released and the open file descriptors drop to a reasonable value. This further suggests it is squid hanging on to these FDs.
lsof |grep proxy |wc -l
1221
Any suggestions? I guess it's something blatantly obvious, but we have been looking at this for a couple of days and we are not getting anywhere...
Thanks again
_______________________________________________ squid-users mailing list squid-users@xxxxxxxxxxxxxxxxxxxxx https://lists.squid-cache.org/listinfo/squid-users