On 01/09/20 7:17 pm, Alex Rousskov wrote:
On 9/1/20 2:32 AM, Amish wrote:
I have frequently observed an issue with squid.service but I am not able
to detect the real cause. As mostly it works but sometimes does not.
What happens is squid starts correctly, but systemd does not seem to be
getting the notification from squid that it has started.
Sep 01 06:40:04 foo systemd[1]: Starting Squid Web Proxy Server...
Sep 01 06:41:34 foo systemd[1]: squid.service: start operation timed out. Terminating.
Sep 01 06:42:06 foo systemd[1]: squid.service: Failed with result 'timeout'.
Sep 01 06:42:06 foo systemd[1]: Failed to start Squid Web Proxy Server.
When did your Squid workers print their "Accepting ... connections at
..." message(s)? Those messages will give you the approximate time of
sd_notify() calls made by Squid workers.
Accepting ... connections at ... message came almost immediately (in 1
second).
Sep 01 06:40:05 foo squid[8446]: Set Current Directory to /var/cache/squid
Sep 01 06:40:05 foo squid[8446]: Starting Squid Cache version 4.13 for
x86_64-pc-linux-gnu...
Sep 01 06:40:05 foo squid[8446]: Service Name: squid
Sep 01 06:40:05 foo squid[8446]: Process ID 8446
Sep 01 06:40:05 foo squid[8446]: Process Roles: worker
Sep 01 06:40:05 foo squid[8446]: With 16384 file descriptors available
Sep 01 06:40:05 foo squid[8446]: Initializing IP Cache...
Sep 01 06:40:05 foo squid[8446]: DNS Socket created at [::], FD 5
Sep 01 06:40:05 foo squid[8446]: DNS Socket created at 0.0.0.0, FD 10
Sep 01 06:40:05 foo squid[8446]: Adding domain localdomain from
/etc/resolv.conf
Sep 01 06:40:05 foo squid[8446]: Adding nameserver 127.0.0.1 from
/etc/resolv.conf
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 5/32
'security_file_certgen' processes
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 1/1 'prrdr'
processes
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 1/1 'prusr'
processes
Sep 01 06:40:05 foo squid[8446]: Logfile: opening log
daemon:/var/log/squid/access.log
Sep 01 06:40:05 foo squid[8446]: Logfile Daemon: opening log
/var/log/squid/access.log
Sep 01 06:40:05 foo squid[8446]: Local cache digest enabled;
rebuild/rewrite every 3600/3600 sec
Sep 01 06:40:05 foo squid[8446]: Store logging disabled
Sep 01 06:40:05 foo squid[8446]: Swap maxSize 0 + 262144 KB, estimated
20164 objects
Sep 01 06:40:05 foo squid[8446]: Target number of buckets: 1008
Sep 01 06:40:05 foo squid[8446]: Using 8192 Store buckets
Sep 01 06:40:05 foo squid[8446]: Max Mem size: 262144 KB
Sep 01 06:40:05 foo squid[8446]: Max Swap size: 0 KB
Sep 01 06:40:05 foo squid[8446]: Using Least Load store dir selection
Sep 01 06:40:05 foo squid[8446]: Set Current Directory to /var/cache/squid
Sep 01 06:40:05 foo squid[8446]: Finished loading MIME types and icons.
Sep 01 06:40:05 foo squid[8446]: HTCP Disabled.
Sep 01 06:40:05 foo squid[8446]: Squid plugin modules loaded: 0
Sep 01 06:40:05 foo squid[8446]: Adaptation support is on
Sep 01 06:40:05 foo squid[8446]: Accepting SSL bumped HTTP Socket
connections at local=[::]:3128 remote=[::] FD 27 flags=9
Sep 01 06:40:06 foo squid[8446]: storeLateRelease: released 0 objects
Sep 01 06:40:06 foo squid[8446]: ERROR: negotiating TLS on FD 23:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:40:06 foo squid[8446]: ERROR: negotiating TLS on FD 26:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Browser was able to access the websites after this point. Many such
lines and then
Sep 01 06:41:26 foo squid[8446]: parse URL too large (11871 bytes)
Sep 01 06:41:26 foo squid[8446]: ERROR: negotiating TLS on FD 96:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:27 foo squid[8446]: ERROR: negotiating TLS on FD 76:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:30 foo squid[8446]: Error negotiating SSL connection on FD
54: error:00000001:lib(0):func(0):reason(1) (1/-1)
Sep 01 06:41:32 foo squid[8446]: ERROR: negotiating TLS on FD 78:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:33 foo squid[8446]: ERROR: negotiating TLS on FD 90:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:33 foo squid[8446]: ERROR: negotiating TLS on FD 95:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:34 foo systemd[1]: squid.service: start operation timed
out. Terminating.
Sep 01 06:41:34 foo squid[8446]: Preparing for shutdown after 692 requests
Sep 01 06:41:34 foo squid[8446]: Waiting 30 seconds for active
connections to finish
Sep 01 06:41:34 foo squid[8446]: Closing HTTP(S) port [::]:3128
Sep 01 06:42:05 foo squid[8446]: Shutdown: NTLM authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Negotiate authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Digest authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Basic authentication.
Sep 01 06:42:06 foo squid[8446]: Shutting down...
Sep 01 06:42:06 foo squid[8446]: storeDirWriteCleanLogs: Starting...
Sep 01 06:42:06 foo squid[8446]: Finished. Wrote 0 entries.
Sep 01 06:42:06 foo squid[8446]: Took 0.00 seconds ( 0.00 entries/sec).
Sep 01 06:42:06 foo squid[8446]: Logfile: closing log
daemon:/var/log/squid/access.log
Sep 01 06:42:06 foo squid[8446]: Logfile Daemon: closing log
daemon:/var/log/squid/access.log
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 11
security_file_certgen #1
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 13
security_file_certgen #2
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 15
security_file_certgen #3
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 17
security_file_certgen #4
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 19
security_file_certgen #5
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 21 prrdr #1
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 24 prusr #1
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 25 IPC UNIX STREAM
Parent
Sep 01 06:42:06 foo squid[8446]: Squid Cache (Version 4.13): Exiting
normally.
Sep 01 06:42:06 foo squid[8444]: Squid Parent: squid-1 process 8446
exited with status 0
Sep 01 06:42:06 foo squid[8444]: Removing PID file (/run/squid.pid)
Sep 01 06:42:06 foo systemd[1]: squid.service: Failed with result 'timeout'.
Sep 01 06:42:06 foo systemd[1]: Failed to start Squid Web Proxy Server.
Not sure that parse URL line has anything to do with this bug as
sd_notify() was expected to happen long before that.
There are 3-4 other servers (all Arch linux and squid 4.13) which I
control, as of now they are working fine with Type=Notify.
Squid cache.log shows no error w.r.t. notify and seems to start correctly.
AFAICT, Squid code does not report sd_notify() failure to notify iff
that failure was due to an unset $NOTIFY_SOCKET variable. The first
sd_notify() call in Squid also unsets $NOTIFY_SOCKET variable (in the
calling worker). However, I cannot explain why that variable would be
unset during the first sd_notify() call _sometimes_.
Perhaps systemd is confused by concurrent notifications coming from
multiple workers?
Can you enable some kind of sd_notify() debugging that would show us
what the first sd_notify() call was doing and when/whether systemd
received the notification from Squid?
I have no idea how to do that. If you know then please do let me know.
I will try if I can. Server is production / live server, and without
knowing how to replicate this bug, its difficult for me to test. As I
can not have too much downtime for testing.
Thank you for your response,
Amish.
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users