On 11/11/2011 8:04 p.m., ftiaronsem wrote:
On 11/10/2011 03:27 AM, Amos Jeffries wrote:
On Wed, 09 Nov 2011 23:54:12 +0100, ftiaronsem wrote:
Hello all,
This one gives me a headache. I joined my Ubuntu 10.04 LTS server
running squid 2.7.STABLE7 and samba 3.4.7 to my Windows 2008 domain
without problems.
Squid also started fine using
/usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
/usr/lib/squid/wbinfo_group.pl
for authentication. However, after a while, some users get DENIED
messages. A few hours after that, squid crashes completely, complaining:
2011/11/08 15:22:56| WARNING: up to 50 pending requests queued
2011/11/08 15:22:56| Consider increasing the number of
ntlmauthenticator processes to at least 60 in your config file.
FATAL: Too many queued ntlmauthenticator requests (51 on 10)
Read that message again.
Your Squid dies if it has to handle 51 or more parallel TCP
connections opened during the time it takes to complete an NTLM
handshake.
One client browser will open at least 8 connections for most popular
websites.
Winbind logs show up a lot of stuff like
[2011/11/08 15:19:06, 0]
winbindd/winbindd_dual.c:186(async_request_timeout_handler)
async_request_timeout_handler: child pid 25224 is not responding.
Closing connection to it.
[2011/11/08 15:19:06, 1] winbindd/winbindd_util.c:303(trustdom_recv)
Could not receive trustdoms
So I am tempted to conclude that this is a samba/winbind problem.
However I am often getting similar errors in the winbind logs at other
sites, which run smoothly.
There do seem to be problems in winbind, regardless of whether they get
bad enough to break Squid or not.
They will make that handshake period longer, with that limit of 50
getting closer every second it lasts.
Do you have similar warnings in your error logs? Judging by your
experience, what would you think is the most likely fix? Upgrading
samba?
Look up what those winbind errors are about first. Config changes or
other software upgrades may be needed as well.
This might be it:
http://lists.samba.org/archive/samba-technical/2008-June/059504.html
Amos
Thanks for your answer
I will have a try in resolving these winbind errors. Hopefully I'll
find something on the net.
Hitting the ntlmauthenticator limit does not seem that likely, since I
got the first warning two minutes before
I was not guessing. That log WARNING only occurs when the helper load
capacity is exceeded; the FATAL only occurs when the queue limit is hit
during a period of overload.
Traffic spikes come in all sizes and durations. 2 minutes is not a very
long one.
2011/11/08 15:20:38| WARNING: All ntlmauthenticator processes are busy.
2011/11/08 15:20:38| WARNING: up to 10 pending requests queued
Overload. (capacity + 10 connections)
2011/11/08 15:21:10| WARNING: All ntlmauthenticator processes are busy.
2011/11/08 15:21:10| WARNING: up to 26 pending requests queued
2011/11/08 15:21:10| Consider increasing the number of
ntlmauthenticator processes to at least 36 in your config file.
More overload. (capacity + 16 connections + earlier queue of 10)
16>10. The traffic load is increasing even further past the rate where
overload was hit.
2011/11/08 15:21:41| WARNING: All ntlmauthenticator processes are busy.
2011/11/08 15:21:41| WARNING: up to 38 pending requests queued
2011/11/08 15:21:41| Consider increasing the number of
ntlmauthenticator processes to at least 48 in your config file.
Even more overload. (capacity + 12 connections + earlier queue of 26)
12<16. Traffic is starting to reduce, but is still well above the overload rate.
2011/11/08 15:22:12| WARNING: All ntlmauthenticator processes are busy.
2011/11/08 15:22:12| WARNING: up to 46 pending requests queued
2011/11/08 15:22:12| Consider increasing the number of
ntlmauthenticator processes to at least 56 in your config file.
Even more overload. (capacity + 8 connections + earlier queue of 38)
8<12. Traffic is reducing more, but slowly, and is still well above the
overload rate. The queue is getting very long...
2011/11/08 15:22:56| WARNING: All ntlmauthenticator processes are busy.
2011/11/08 15:22:56| WARNING: up to 50 pending requests queued
2011/11/08 15:22:56| Consider increasing the number of
ntlmauthenticator processes to at least 60 in your config file.
Queue limit exceeded. Crash.
4<8. The traffic rate is still in overload, but it has almost dropped
back below the point where helpers can start to catch up on the backlog.
Given another minute the queue might have cleared again. Too bad the
absolute maximum limit was hit already.
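
The arithmetic in the walk-through above can be reproduced with a short Python sketch. It only parses the "pending requests queued" lines quoted in this thread; the function names are made up for illustration. The "queue depth + current helper count" rule for the suggested number is simply what the quoted log lines appear to follow (26+10=36, 38+10=48, ...), not something taken from the Squid source.

```python
import re

# WARNING lines copied from the log excerpts quoted in this thread.
LOG = """\
2011/11/08 15:20:38| WARNING: up to 10 pending requests queued
2011/11/08 15:21:10| WARNING: up to 26 pending requests queued
2011/11/08 15:21:41| WARNING: up to 38 pending requests queued
2011/11/08 15:22:12| WARNING: up to 46 pending requests queued
2011/11/08 15:22:56| WARNING: up to 50 pending requests queued
"""

# Helper count taken from the FATAL line "(51 on 10)".
CHILDREN = 10

PATTERN = re.compile(r"up to (\d+) pending requests queued")


def queue_depths(log_text):
    """Return the queue depth reported by each WARNING line, in order."""
    return [int(m.group(1)) for m in PATTERN.finditer(log_text)]


def growth(depths):
    """Return how much the queue grew since the previous warning.

    The first entry is the growth from an empty queue.
    """
    previous = 0
    deltas = []
    for depth in depths:
        deltas.append(depth - previous)
        previous = depth
    return deltas


def suggested_children(depths, children=CHILDREN):
    """Assumed rule: queue depth plus the current number of helpers."""
    return [depth + children for depth in depths]


depths = queue_depths(LOG)
for depth, delta, hint in zip(depths, growth(depths), suggested_children(depths)):
    print(f"queued={depth:2d}  growth=+{delta:2d}  children>={hint}")
```

The growth column comes out as 10, 16, 12, 8, 4: the queue keeps growing at every step, just more slowly each time, which is why it still reaches the limit.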
The solution is to raise the number of helper children. Each helper
child contributes some req/sec amount to the "capacity" number.
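
As a rough sketch of what that looks like in squid.conf for Squid 2.7, using the helper command line from the original post: the value 60 is only the number Squid itself suggested in the last warning above, so treat it as a starting point. The right figure depends on your peak traffic and on how quickly winbind answers.

```
# NTLM helper as already used in this setup
auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
# Raise the helper count; the FATAL line "(51 on 10)" implies it is 10 now
auth_param ntlm children 60
```

Squid needs a reconfigure or restart to pick up the new helper count.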
Amos