On 17.04.2012 09:03, Chris Waters wrote:
I am working some authentication/performance issues with Squid 3.1.19
on
Ubuntu with NTLM via winbind and a filtering upstream service (proxy
chaining). Is there a method for turning on more verbose logging that
either shows the object the authentication parts are processing on or
interleaving the access log and debug log so that its apparent which
authentication request belongs to which object? What I am trying to
determine is if there are authentication issues or if it's the
upstream
causing the latency.
You can use the auth_user_request/AuthUserRequest hex IDs to determine
creation (start) and result (finish) lookup timings. Those objects store
the handshake and validation state info and are thus the best indicator
available of what is going on.
I am not specifically talking about turning specific logging sections
on
(like 28,9 29,9 for example), but a better method of matching what's
being
requested and what's being done to authenticate the request.
NTLM does not authenticate *requests*. Please be clear on that.
It hijacks and violates the HTTP layer headers to authenticate the TCP
connection layer packets. Then assumes (incorrectly) that all requests
sent over the TCP connection are from the same end-user. In order to
cope with that bad assumption Squid is forced to disable all HTTP
multiplexing abilities on connections sighted using NTLM or Negotiate
auth (we call this "pining").
* This pinning is #3 major cause of latency issues with NTLM in proxy
chaining. Since it locks two socket connections per end-user at each hop
along the HTTP chain (creating a limit of ~64K users per proxy listening
IP).
As a side effect; other than the first-request rejections there is
nothing to correlate access.log records with NTLM authentication. You
need to create a custom log format with both %>a and %>p to get the
client IP:port details about the TCP connection and see what requests
are taking place on it. Then map those requests to a particular state
order for the credentials contained within, then to the AuthUserRequest
object managing those credentials. Fun times.
I am also open to suggestion as to better methods to diagnose these
kinds
of issues as well, so please, if you have an idea, let's here it.
Also look at the size of body/entity/data in requests taking part in
the NTLM handshake. NTLM requires the HTTP request to be repeated for
each of the handshake steps. This is not too bad on a high speed link,
or with small fetch request. But when uploading say a multi-MB (or GB)
object it can take down a network.
* bandwidth + CPU + time wasted discarding POST/PUT bodies is #1 major
cause of latency from NTLM.
Also, look at winbind concurrent lookup load when latency is observed.
Winbind has an absolute limit somewhere around 256 concurrent
authentications. Each authentication takes the *full* handshake worth of
HTTP request round trips to complete. Additional lookups (TCP
connections) are blocked until winbind can service them.
* winbind queue limits is #2 major cause of latency with NTLM.
Two additional things to be aware of:
* NTLM is officially been obsoleted by MS for some years now, to be
completely End-of-Lifed with WindowsXP and none of their new software
supports it by default.
* Negotiate/Kerberos is supported by all the new software and was
designed to avoid the above latency and bandwidth waste issues.
When comparing auth speeds between MS software from before and after
2007, default use of Kerberos is often the reason newer ones work far
faster and more reliably.
Amos