Hi,
I am running squid 3.2.1 on CentOS 6.3 with kerberos authentication
using negotiate_kerberos_auth.
Generally this is working fine, but after some time more and more
helper instances stay busy and never finish the request they were given.
Squid therefore starts new helper processes to keep enough working
helpers available for Kerberos authentication.
This release of Squid is quite old now. Are you able to upgrade your
proxy to the current stable release and see if the problem disappears?
(today that would be 3.3.8)
You can find packages of recent versions for CentOS at
http://wiki.squid-cache.org/KnowledgeBase/CentOS.
I would like to update Squid to version 3.3.8, but there is still
the problem that URLs containing an IP address are not forwarded
to cache peers (see Bug 3848). That makes it unusable for me.
But I did test Squid 3.3.8 on my test system, and it showed exactly the
same problem with Kerberos authentication.
This continues until Squid has no more memory for the helpers:
2013/07/30 08:48:04 kid1| Starting new negotiateauthenticator helpers...
2013/07/30 08:48:04 kid1| helperOpenServers: Starting 1/500
'negotiate_kerberos_auth' processes
2013/07/30 08:48:04 kid1| ipcCreate: fork: (12) Cannot allocate memory
That is bad, but it is unrelated to the helpers getting locked up.
How much RAM is the Squid worker process using at the time this appears?
Starting helpers with fork() requires Squid to be allocated virtual
memory 2x that being used at the time by the worker process.
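On Linux you can read the worker's virtual memory size straight from /proc
to see how much fork() would momentarily need. A minimal sketch, assuming a
Linux /proc filesystem (it uses its own PID only as a stand-in; substitute
the Squid worker's PID):

```python
import os

def vm_size_kb(pid):
    """Return the VmSize of `pid` in kilobytes, as reported by /proc."""
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])  # field is reported in kB
    raise RuntimeError("VmSize not found")

size = vm_size_kb(os.getpid())  # replace with the Squid worker's PID
# fork() briefly duplicates the address space, so starting a helper
# needs roughly twice the worker's current virtual memory available.
print("VmSize: %d kB; fork needs roughly %d kB" % (size, 2 * size))
```

If the worker is large, raising vm.overcommit or adding swap lets the
short-lived fork() succeed even though the child immediately exec()s.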
And how much memory is currently in use by each of those 8 BUSY helpers?
Negotiate Authenticator Statistics:
program: /usr/lib64/squid/negotiate_kerberos_auth
number active: 39 of 500 (0 shutting down)
requests sent: 11141
replies received: 11133
queue length: 0
avg service time: 4 msec
# FD PID # Requests Flags Time Offset Request
1 19 31373 753 B R 3887.019 0
1 37 31390 755 B R 3637.061 0
1 39 31391 2539 B R 2053.518 0
1 41 31392 78 B R 3859.365 0
1 43 31393 807 B R 2008.036 0
1 57 31396 415 B R 2003.899 0
1 63 31397 363 B R 1975.126 0
1 95 31401 329 B R 1944.980 0
1 29 31491 1891 0.009 0 (none)
1 77 31492 813 0.011 0 (none)
1 88 31493 578 0.009 0 (none)
The first eight helper processes are busy and will never return to a
normal state until Squid is restarted.
Gradually, more and more helpers get stuck in the busy state.
strace shows me that these helpers are blocked in a read call:
....
read(0, "r", 1) = 1
read(0, "r", 1) = 1
read(0, "7", 1) = 1
read(0, "+", 1) = 1
read(0, "a", 1) = 1
read(0, "G", 1) = 1
read(0, <unfinished ...>
After this the process never continues.
That does not look blocked to me. The value arriving is changing, just
vvvveeerrrryyyy ssslloowwwlllyyy, one byte per I/O cycle to be exact.
Since Kerberos credentials can be up to 32 KB large it's easy to see why
they are stuck in BUSY state for such long times.
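The strace pattern above — read(0, ..., 1) returning one byte per call —
matches a helper that scans stdin byte by byte looking for the terminating
newline of the request line. A minimal sketch of that read loop
(illustrative only, not the actual negotiate_kerberos_auth code):

```python
import os

def read_request_line(fd):
    """Read one helper request byte-by-byte until newline, as strace shows."""
    buf = bytearray()
    while True:
        # One read() syscall per byte; if the newline never arrives and
        # the writer sends nothing more, this call blocks forever.
        b = os.read(fd, 1)
        if not b:            # EOF: Squid closed the pipe
            return None
        if b == b"\n":       # end of the request line
            return bytes(buf)
        buf += b

# Demonstrate with a pipe standing in for the helper's stdin:
r, w = os.pipe()
os.write(w, b"YR base64token...\n")
print(read_request_line(r))
```

With a complete line the loop returns promptly; with a truncated line it
sits in read() exactly as the strace output shows.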
But Microsoft says that Kerberos credentials can be up to 64 KB
(http://support.microsoft.com/kb/327825/en-us).
And I think that's the problem. Our Active Directory actually issues
Kerberos credentials of up to 40 KB, and I found that exactly these
credentials put the Kerberos helper into the busy state.
strace shows me that the helper process reads credential bytes from
Squid up to 32 KB, then waits at the socket for the rest and stays busy,
but Squid never transmits more than 32 KB. Therefore the helper never
finishes the authentication and is no longer usable.
Users with Kerberos credentials greater than 32 KB cannot use Squid and
cause blocked helper processes.
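Note that the raw token which fits is even smaller than 32 KB, because the
Negotiate token is base64-encoded on the helper request line (4 output
bytes for every 3 input bytes). A rough calculation — the 32 KB line limit
here is taken from the observation above, not from the Squid sources:

```python
def b64_len(raw_bytes):
    """Length of the base64 encoding of raw_bytes bytes (with padding)."""
    return 4 * ((raw_bytes + 2) // 3)

BUFFER = 32 * 1024  # observed limit: Squid stops sending at 32 KB

for kb in (24, 32, 40, 64):
    raw = kb * 1024
    line = b64_len(raw)  # token portion of the "YR <token>" request line
    print("%2d KB token -> %5d byte base64 string, fits: %s"
          % (kb, line, line <= BUFFER))
```

So a raw token of about 24 KB already fills a 32 KB line, which is
consistent with our 40 KB credentials being the ones that wedge helpers.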
Is there a solution for such long kerberos credentials?
I cannot find any error messages in cache.log, even when I switch on
debugging in the helper.
At this rate of I/O in the helper it is unlikely that they will be able
to send a message to cache.log in any reasonable time.
Thank you for help!
Klaus