Search squid archive

Re: squid hangs and dies and can not be killed - needs system reboot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023-12-18 22:29, Amish wrote:
On 19/12/23 01:14, Alex Rousskov wrote:
On 2023-12-18 09:35, Amish wrote:

I use Arch Linux and today I updated squid from squid 5.7 to squid 6.6.

> Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199

I do not know whether the above problem is the primary problem in your setup, but it is a red flag. Transactions on the same connection may get stuck after that message; it is essentially a Squid bug.

I am not sure at all, but this bug might be related to Bug 5187 workaround that went into Squid v6.2 (commit c44cfe7): https://bugs.squid-cache.org/show_bug.cgi?id=5187

Does Squid accept new TCP connections after it enters what you describe as a dead state? For example, does "telnet 127.0.0.1 8080" establishes a connection if executed on the same machine as Squid?

Yes it establishes connection. But I do not know what to do next.

This tells us that your Squid is still listening for incoming connections. Most likely, it is not "dead" but running and just unable to make progress with those connections (for yet unknown reasons). That information is helpful but not sufficient (for me) to solve the problem you are describing.

The next step that I would recommend is to collect debugging information from the running process and share a pointer to the corresponding compressed cache.log file:

* Ideally, start collection when Squid starts and reproduce the problem while collecting full debugging information:
http://wiki.squid-cache.org/SquidFaq/BugReporting#full-debug-output

* If you have to, start collection after Squid is already in bad state and just before you use telnet or browser to tickle Squid:
http://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction

Do not use any secret information (e.g., production certificate keys) for these tests (unless you are going to share the logs privately with those you trust).

Do not downgrade to v5 for these tests.


HTH,

Alex.


Browser showed "Connection timed out" message. But I believe browser's also connected but nothing happened afterwards.


> kill -9 does nothing

Is it possible that you are trying to kill the wrong process? You should be killing this process AFAICT:

> root         601  0.0  0.2  73816 22528 ?        Ss   12:59 0:02
> /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC

I did not clarify but all processes needed SIGKILL and vanished except the Dead squid process which still remained.

# systemctl stop squid

Dec 19 08:46:38 mumbai systemd[1]: squid.service: State 'stop-sigterm' timed out. Killing. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 601 (squid) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 (squid) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 607 (security_file_c) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 608 (security_file_c) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 609 (security_file_c) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 610 (security_file_c) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 611 (security_file_c) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 622 (log_file_daemon) with signal SIGKILL. Dec 19 08:46:38 mumbai systemd[1]: squid.service: Main process exited, code=killed, status=9/KILL Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 (squid) with signal SIGKILL.

Waited for 2 minutes for squid to stop then pressed ctrl-c to systemctl stop squid command.

As you can see in last line shows that attempt was made to kill DEAD process with PID 604.

# ps aux |grep squid
proxy        604  0.0  0.0      0     0 ?        D    Dec18   0:03 [squid]

Now only DEAD squid process remains.

What next? Should I downgrade to 5.9 and check?

Regards

Amish

Alex.


After the update from 5.7 to 6.6, squid starts but then reaches Dead state in a minute or two.

# ps aux | grep squid
root         601  0.0  0.2  73816 22528 ?        Ss   12:59 0:02 /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC
proxy        604  0.0  0.0      0     0 ?        D    12:59 0:03 [squid]
proxy        607  0.0  0.0  11976  7424 ?        S    12:59 0:00 (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB proxy        608  0.0  0.0  11976  7168 ?        S    12:59 0:00 (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB proxy        609  0.0  0.0  11712  5632 ?        S    12:59 0:00 (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB proxy        610  0.0  0.0  11712  5376 ?        S    12:59 0:00 (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB proxy        611  0.0  0.0  11712  5504 ?        S    12:59 0:00 (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB proxy        622  0.0  0.0   6116  3200 ?        S    12:59 0:00 (logfile-daemon) /var/log/squid/access.log

And then all requests get stuck. Notice the D (dead) state of squid.

I use multiple ports for multiple purposes. (It all worked fine in squid 5.7)

Dec 18 12:59:10 mumbai squid[601]: Starting Authentication on port [::]:3128 Dec 18 12:59:10 mumbai squid[601]: Disabling Authentication on port [::]:3128 (interception enabled) Dec 18 12:59:10 mumbai squid[601]: Starting Authentication on port [::]:8081 Dec 18 12:59:10 mumbai squid[601]: Disabling Authentication on port [::]:8081 (interception enabled) Dec 18 12:59:12 mumbai squid[601]: Starting Authentication on port [::]:8082 Dec 18 12:59:12 mumbai squid[601]: Disabling Authentication on port [::]:8082 (interception enabled) Dec 18 12:59:12 mumbai squid[601]: Starting Authentication on port [::]:8083 Dec 18 12:59:12 mumbai squid[601]: Disabling Authentication on port [::]:8083 (interception enabled) Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port [::]:8084 Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port [::]:8084 (interception enabled) Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port [::]:3136 Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port [::]:3136 (interception enabled) Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port [::]:3137 Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port [::]:3137 (interception enabled)
...
Dec 18 12:59:29 mumbai squid[604]: Adaptation support is on
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP Socket connections at conn19 local=[::]:3128 remote=[::] FD 27 flags=41
                                        listening port: 3128
Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket connections at conn21 local=[::]:8080 remote=[::] FD 28 flags=9
                                        listening port: 8080
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL bumped HTTPS Socket connections at conn23 local=[::]:8081 remote=[::] FD 29 flags=41
                                        listening port: 8081
Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket connections at conn25 local=[::]:8092 remote=[::] FD 30 flags=9
                                        listening port: 8092
Dec 18 12:59:29 mumbai systemd[1]: Started Squid Web Proxy Server.
Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket connections at conn27 local=[::]:8093 remote=[::] FD 31 flags=9
                                        listening port: 8093
Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket connections at conn29 local=[::]:8094 remote=[::] FD 32 flags=9
                                        listening port: 8094
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL bumped HTTPS Socket connections at conn31 local=[::]:8082 remote=[::] FD 33 flags=41
                                        listening port: 8082
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL bumped HTTPS Socket connections at conn33 local=[::]:8083 remote=[::] FD 34 flags=41
                                        listening port: 8083
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL bumped HTTPS Socket connections at conn35 local=[::]:8084 remote=[::] FD 35 flags=41
                                        listening port: 8084
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP Socket connections at conn37 local=[::]:3136 remote=[::] FD 36 flags=41
                                        listening port: 3136
Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP Socket connections at conn39 local=[::]:3137 remote=[::] FD 37 flags=41
                                        listening port: 3137

And then following errors came:


Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn41 local=192.168.0.1:8080 remote=192.168.0.111:53867 FD 12 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master53 Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn42 local=192.168.0.1:8080 remote=192.168.0.111:53868 FD 14 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master53 Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn43 local=192.168.0.1:8080 remote=192.168.0.111:53869 FD 16 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master57 Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn44 local=192.168.0.1:8080 remote=192.168.0.111:53870 FD 12 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master57 Dec 18 12:59:56 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn62 local=192.168.0.1:8080 remote=192.168.0.111:53887 FD 12 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master95 Dec 18 12:59:59 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn64 local=192.168.0.1:8080 remote=192.168.0.111:53888 FD 12 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master99 Dec 18 13:00:02 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn65 local=192.168.0.1:8080 remote=192.168.0.178:56115 FD 12 flags=1: SQUID_TLS
_ERR_ACCEPT+TLS_LIB_ERR=A000418+TLS_IO_ERR=1
                                        current master transaction: master53 Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199 local=192.168.0.1:8093 remote=192.168.0.101:52211 FD 52 flags=1                                         connection: conn199 local=192.168.0.1:8093 remote=192.168.0.101:52211 FD 52 flags=1 Dec 18 13:01:45 mumbai squid[604]: ERROR: failure while accepting a TLS connection on conn240 local=192.168.0.1:8093 remote=192.168.0.111:53931 FD 48 flags=1: SQUID_TL
S_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
                                        current master transaction: master314


After this point there is nothing in systemd journal (via: journalctl -f -u squid) and same lines are in cache.log.

Squid got stuck (DEAD state) at 13:01 and right now it 19:26 (6 hours passed) and squid is still in dead state.

kill -9 or kill -ALRM or -HUP also does nothing.

So to restart squid - I will need to restart whole system.

I have sslbump directives but it is not really applied.

#NOTE: nosslbump_ips below contains 192.168.0.0/24 (whole LAN) so effectively there is no SSL bump after step1.

acl nosslbump_ips src 192.168.0.0/24
ssl_bump splice ssl_step1 nosslbump_ips
ssl_bump peek ssl_step1
ssl_bump splice nosslbump_domains
ssl_bump stare sslbump_domains
ssl_bump splice ssl_step2
ssl_bump bump all


Any idea? If anything changed from 5.7 to 6.6 that may cause this behaviour?

Looking at changelog:

Bug 5256: Intercepting port fails to accept
https://bugs.squid-cache.org/show_bug.cgi?id=5256

Bug 5154: Do not open IPv6 sockets when IPv6 is disabled
https://bugs.squid-cache.org/show_bug.cgi?id=5154

Not sure if above two bug FIXES (in between v5.7 to v6.6) are related to my issue.

I ran netstat:

# netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address State       PID/Program name
...
tcp6      33      0 :::3137 :::*                    LISTEN -
tcp6       0      0 :::3136 :::*                    LISTEN -
tcp6       4      0 :::3128 :::*                    LISTEN -
tcp6       0      0 :::8081 :::*                    LISTEN -
tcp6       0      0 :::8080 :::*                    LISTEN -
tcp6       0      0 :::8083 :::*                    LISTEN -
tcp6       0      0 :::8082 :::*                    LISTEN -
tcp6       0      0 :::8084 :::*                    LISTEN -
tcp6    4097      0 :::8093 :::*                    LISTEN -
tcp6       0      0 :::8092 :::*                    LISTEN -
tcp6       0      0 :::8094 :::*                    LISTEN -
...

I do not have IPv6 enabled, yet there are 33 and 4097 numbers in Recv-Q and also no process/PID owns these ports.

Same IPv4 ports are not shown in use by netstat, so only IPv6 ports remain open, that too orphaned!

So what is happening?

Any idea to solve or any workaround?

Thank you,

Amish.
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users




[Index of Archives]     [Linux Audio Users]     [Samba]     [Big List of Linux Books]     [Linux USB]     [Yosemite News]

  Powered by Linux