Re: Socket handle leak?

Paolo Prinsecchi <paolo.prinx@xxxxxxxxx> · Fri, 12 Jul 2024 13:04:32 +0000 (UTC)

Thanks. We have limits set at 100K, and squid can easily reach that. The problem is that the number of FDs in use keeps increasing. A workaround is to restart squid every time it goes over a certain value, but that is not really a solution. In the same situation, with CentOS and squid 3.5, we seldom went over 20K FDs in use.
Thanks for your reply. 

Panem et circenses

On Friday, July 12, 2024, 7:05 PM, Yvain PAYEN <yvain.payen@xxxxxxxx> wrote:

Hi, 

In my setup (also Ubuntu) I have made these changes:

root@proxy: # cat /etc/security/limits.d/squid.conf 
squid        soft    nofile  64000 
squid        hard    nofile  65500 

root@proxy: # cat /etc/squid/squid.conf | grep max_file 
max_filedesc 64000 

This forces the system limits for the squid process and tells squid how many FDs it can consume.
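To check that these limits actually reach the running process, the effective values can be read from /proc. This is a minimal sketch: `max_open_files` is a hypothetical helper name, and it assumes the usual layout of /proc/<pid>/limits. As far as I know, entries in limits.d are applied by pam_limits at login, so a daemon started by systemd may not pick them up, which is why reading the live value is worth doing.

```shell
# Print the soft and hard "Max open files" limits from a /proc/<pid>/limits file.
# max_open_files is a hypothetical helper name, not part of squid.
max_open_files() {
    # $1: path to a limits file, e.g. /proc/1049/limits
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Example (assumes a running squid and that pidof is available):
#   max_open_files "/proc/$(pidof -s squid)/limits"
```

If the values printed differ from what limits.d sets, the limit is being applied somewhere else (for example in the service unit).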

Regards, 

Yvain PAYEN 

From: squid-users <squid-users-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of paolo.prinx@xxxxxxxxx
Sent: Friday, 12 July 2024 12:58
To: squid-users@xxxxxxxxxxxxxxxxxxxxx
Subject: Socket handle leak?


Hello, 

   apologies in advance for the silly question. 

We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar, and might have some suggestions about what we are obviously missing?

In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum. The handles are not released until squid is restarted (-k restart).

It is somewhat similar to what is reported under https://access.redhat.com/solutions/3362211 . They state that:

If an application fails to close() its socket descriptors and continues to allocate new sockets then it can use up all the system memory on TCP(v6) slab objects.

Note some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object. But will not be accounted for in /proc/net/sockstat(6) or "ss" or "netstat".

It can be determined whether this is an application sockets leak, by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are freed then the application is responsible. As that means that destructor routines have found open file descriptors to sockets in the process.

"This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."

For example, on a server with squid 5.7, unmodified package: 

list of open files; 

lsof |wc -l 

56963 

of which 35K in TCPv6: 

lsof |grep proxy |grep TCPv6 |wc -l 

    35301 

Under /proc I see fewer objects:

    cat  /proc/net/tcp6 |wc -l 

    3095 

but the number of objects in the slabs is high 

    cat /proc/slabinfo |grep TCPv6 

    MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0 

    tw_sock_TCPv6       1155   1155    248   33    2 : tunables    0    0    0 : slabdata     35     35      0 

    request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0 

    TCPv6              38519  38519   2432   13    8 : tunables    0    0    0 : slabdata   2963   2963      0
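To put a number on what those slab objects cost, the object count can be multiplied by the object size. This is a rough sketch, assuming the slabinfo 2.x column order (name, active_objs, num_objs, objsize, ...); `slab_mib` is a hypothetical helper name, and the estimate ignores per-slab overhead.

```shell
# slab_mib: rough memory estimate for one slab cache.
# Reads slabinfo-formatted lines on stdin; the helper name is hypothetical.
slab_mib() {
    # $1: exact cache name, e.g. TCPv6 or TCP
    # Column 3 is num_objs and column 4 is objsize in bytes.
    awk -v name="$1" '$1 == name {
        printf "%s %d objects, ~%d MiB\n", $1, $3, ($3 * $4) / 1048576
    }'
}

# Usage (reading /proc/slabinfo usually requires root):
#   slab_mib TCPv6 < /proc/slabinfo
```

For the TCPv6 cache in this dump, 38519 objects of 2432 bytes each come to roughly 89 MiB.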

I have 35K lines like this:

    lsof |grep proxy |grep TCPv6 |more 

    squid        1049              proxy   13u     sock                0,8        0t0    5428173 protocol: TCPv6 

    squid        1049              proxy   14u     sock                0,8        0t0   27941608 protocol: TCPv6 

    squid        1049              proxy   24u     sock                0,8        0t0   45124047 protocol: TCPv6 

    squid        1049              proxy   25u     sock                0,8        0t0   50689821 protocol: TCPv6 

... 
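One way to cross-check this without lsof is to compare the socket inodes held by the process with the inodes listed in /proc/net/tcp and /proc/net/tcp6. The following is only a sketch: the function names are hypothetical, it assumes the inode is the tenth field of those tables, and it must run as root or as the process owner to read /proc/<pid>/fd. Sockets of other families (UDP, UNIX) are also counted as untracked unless their tables are taken into account.

```shell
# fd_socket_inodes: print the inode of every socket FD under an fd directory.
fd_socket_inodes() {
    # $1: fd directory, e.g. /proc/1049/fd
    for fd in "$1"/*; do
        readlink "$fd"
    done 2>/dev/null | sed -n 's/^socket:\[\([0-9]*\)\]$/\1/p'
}

# table_inodes: print the inode column of /proc/net/tcp-style tables.
table_inodes() {
    # Line 1 of each file is the header; field 10 is the inode.
    awk 'FNR > 1 { print $10 }' "$@"
}

# count_untracked: count socket FDs whose inode appears in none of the tables.
count_untracked() {
    # $1: fd directory; remaining arguments: table files
    fddir=$1; shift
    known=$(mktemp)
    table_inodes "$@" > "$known"
    fd_socket_inodes "$fddir" | grep -vxFc -f "$known"
    rm -f "$known"
}

# Usage (PID 1049 taken from the lsof output above):
#   count_untracked /proc/1049/fd /proc/net/tcp /proc/net/tcp6
```

A count that keeps growing while /proc/net/tcp6 stays small would be consistent with sockets that are closed at the TCP level but still held open by a file descriptor.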

We thought this might be a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no IPv6 support. The problem just moved to TCPv4:

lsof |wc -l 

120313 

cat /proc/slabinfo |grep TCP 

MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0 

tw_sock_TCPv6          0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0 

request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0 

TCPv6                208    208   2432   13    8 : tunables    0    0    0 : slabdata     16     16      0 

MPTCP                  0      0   1856   17    8 : tunables    0    0    0 : slabdata      0      0      0 

tw_sock_TCP         5577   5577    248   33    2 : tunables    0    0    0 : slabdata    169    169      0 

request_sock_TCP    1898   2002    304   26    2 : tunables    0    0    0 : slabdata     77     77      0 

TCP               102452 113274   2240   14    8 : tunables    0    0    0 : slabdata   8091   8091      0

cat /proc/net/tcp |wc -l 

255 

After restarting squid the slab objects are released and the open file descriptors drop to a reasonable value. This further suggests it is squid hanging on to these FDs.

lsof |grep proxy |wc -l 

1221 

Any suggestions? I guess it's something blatantly obvious, but we have been looking at this for a couple of days and we're not getting anywhere...

Thanks again 

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users