
Socket handle leak?


 



Hello,
   apologies in advance for the silly question.

We are having stability issues with our Squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. Has anyone here seen something similar, or can suggest what we are obviously missing?


In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum. The handles are not released until squid is restarted (-k restart).
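To watch the growth without running a full lsof each time, something like the following could count the socket descriptors a single process holds by reading its /proc/<pid>/fd directory. This is only a sketch: the function name count_socket_fds is made up for illustration, and it assumes a Linux /proc layout where socket descriptors show up as symlinks to socket:[inode]. It needs to run as root or as the user that owns the process.

```shell
# count_socket_fds DIR
# Count entries in DIR (normally /proc/<pid>/fd) whose symlink target
# starts with "socket:". The function name is hypothetical.
count_socket_fds() {
    fd_dir=$1
    count=0
    for fd in "$fd_dir"/*; do
        # readlink fails for non-symlinks or an unmatched glob; skip those
        target=$(readlink "$fd") || continue
        case $target in
            socket:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Example (PID 1049 is the squid process from the lsof output in this post):
#   count_socket_fds /proc/1049/fd
```

Logging this number every few minutes should show whether the count climbs steadily or jumps during particular events.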


It is somewhat similar to what is reported at https://access.redhat.com/solutions/3362211 . They state that:
  • If an application fails to close() its socket descriptors and continues to allocate new sockets then it can use up all the system memory on TCP(v6) slab objects.
  • Note some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object. But will not be accounted for in /proc/net/sockstat(6) or "ss" or "netstat".
  • It can be determined whether this is an application sockets leak, by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are freed then the application is responsible. As that means that destructor routines have found open file descriptors to sockets in the process.

"This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."


For example, on a server with squid 5.7, unmodified package:

Total number of open files:
lsof |wc -l
56963

of which about 35K are TCPv6:
lsof |grep proxy |grep TCPv6 |wc -l
    35301

Under /proc/net/tcp6 I see far fewer entries:
    cat  /proc/net/tcp6 |wc -l
    3095

but the number of TCPv6 objects in the slabs is high:
    cat /proc/slabinfo |grep TCPv6
    MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
    tw_sock_TCPv6       1155   1155    248   33    2 : tunables    0    0    0 : slabdata     35     35      0
    request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
    TCPv6              38519  38519   2432   13    8 : tunables    0    0    0 : slabdata   2963   2963      0
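
The comparison between the slab count and /proc/net/tcp6 can be scripted roughly as below. This is a sketch under some assumptions: the function names slab_active, proc_net_entries and socket_gap are invented for illustration, the second column of /proc/slabinfo is taken to be the active object count, and the first line of /proc/net/tcp6 is treated as a header. Reading /proc/slabinfo normally requires root.

```shell
# slab_active NAME FILE
# Print the active-object count (column 2) for the slab cache whose name
# matches NAME exactly, so TCPv6 does not also match tw_sock_TCPv6.
slab_active() {
    awk -v name="$1" '$1 == name { print $2 }' "$2"
}

# proc_net_entries FILE
# Number of socket rows in a /proc/net/tcp-style file, excluding the header.
proc_net_entries() {
    awk 'END { print (NR > 0 ? NR - 1 : 0) }' "$1"
}

# socket_gap NAME SLABFILE NETFILE
# Slab objects that have no matching row in the /proc/net file.
socket_gap() {
    active=$(slab_active "$1" "$2")
    listed=$(proc_net_entries "$3")
    echo $(( ${active:-0} - listed ))
}

# Example:
#   socket_gap TCPv6 /proc/slabinfo /proc/net/tcp6
```

With the figures in this post that would be 38519 - 3094 = 35425, which is in the same range as the 35301 TCPv6 lines reported by lsof.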

I have about 35K lines like this:
    lsof |grep proxy |grep TCPv6 |more
    squid        1049              proxy   13u     sock                0,8        0t0    5428173 protocol: TCPv6
    squid        1049              proxy   14u     sock                0,8        0t0   27941608 protocol: TCPv6
    squid        1049              proxy   24u     sock                0,8        0t0   45124047 protocol: TCPv6
    squid        1049              proxy   25u     sock                0,8        0t0   50689821 protocol: TCPv6
...
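
To check whether a single squid process holds all of these entries, the lsof output could be tallied per PID along these lines. Again just a sketch: sock_lines_per_pid is a made-up name, and it assumes the PID is the second column and that the lines of interest contain " sock " and "protocol: TCP" as in the sample above. If lsof also prints one row per thread, the same descriptor may be counted more than once.

```shell
# sock_lines_per_pid
# Read lsof output on stdin and print "PID COUNT" for lines that look like
# "... sock ... protocol: TCP" or "... protocol: TCPv6", highest count first.
sock_lines_per_pid() {
    awk '/ sock / && /protocol: TCP/ { n[$2]++ }
         END { for (pid in n) print pid, n[pid] }' | sort -k2,2nr
}

# Example:
#   lsof | sock_lines_per_pid
```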


We thought this might be some odd IPv6 behaviour, since we only route IPv4, so we compiled a more recent version of squid without IPv6 support. The leak simply moved to TCP (IPv4) sockets:

lsof |wc -l
120313

cat /proc/slabinfo |grep TCP
MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCPv6          0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0
request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                208    208   2432   13    8 : tunables    0    0    0 : slabdata     16     16      0
MPTCP                  0      0   1856   17    8 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCP         5577   5577    248   33    2 : tunables    0    0    0 : slabdata    169    169      0
request_sock_TCP    1898   2002    304   26    2 : tunables    0    0    0 : slabdata     77     77      0
TCP               102452 113274   2240   14    8 : tunables    0    0    0 : slabdata   8091   8091      0


cat /proc/net/tcp |wc -l
255

After restarting squid the slab objects are released and the open file descriptors drop to a reasonable value. This further suggests it is squid hanging on to these FDs.

lsof |grep proxy |wc -l
1221


Any suggestions? I guess it's something blatantly obvious, but we have been looking at this for a couple of days and are not getting anywhere...

Thanks again


_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users
