Re: Linking Squid Logs

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Thu, 1 Apr 2021 20:13:14 +1300

On 1/04/21 6:59 am, Garbacik, Joe wrote:
In my squid.conf, I have the following logformat which passes all the 
data from the client via the load balancer to the squid server as headers:

...

This creates the two logs at the end of this message. What I am 
wondering is:

 1. Why aren't all the request headers (look between *** REQUEST
    HEADERS and *** RESPONSE HEADERS in each log) seen in the first log
    present in the second log?

They are different transactions.

 2. I'm assuming since squid is then making the request in the second
    log, it leaves the items in Flow0 (client -> load balancer) empty but
    does retain the data for flow1 (load-balancer -> squid) and flow2
    (squid -> destination). Even the XFF is not passed. Is there any way
    to retain this data?

First log entry is an HTTP request to initiate (CONNECT) a tunnel.

Second log entry is an HTTPS request to fetch (GET) data from a server.

What is happening is:

In the beginning there exists a TCP connection between haproxy and 
Squid, transferring HTTP messages.

One of those messages is a CONNECT request, meaning: connect this current 
TCP connection to the named server:port and stop performing HTTP - all 
following bytes will be some other protocol.
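
As a rough illustration of what such a CONNECT message looks like on the wire, here is a minimal Python sketch. The function name and the example host are made up for illustration; only the request-line shape (method, host:port target, HTTP version) is standard HTTP.

```python
def build_connect_request(host, port):
    """Return the raw bytes of a minimal HTTP/1.1 CONNECT request.

    Hypothetical helper for illustration only; a real client would
    usually add further headers (e.g. proxy authentication).
    """
    authority = f"{host}:{port}"
    lines = [
        f"CONNECT {authority} HTTP/1.1",  # target is host:port, not a URL path
        f"Host: {authority}",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


raw = build_connect_request("example.com", 443)
print(raw.decode("ascii"))
```

Everything the client sends after the proxy's success reply to this message is opaque tunnel payload as far as HTTP is concerned.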

When Squid acknowledges that the tunnel is connected, bytes initiating a 
TLS connection start arriving. Squid does its SSL-Bump things to look inside.

The CONNECT message and state related to it are now complete. It gets 
logged and discarded.
  ** this message is your log line #1.

What is found inside the TLS is a private HTTP communication channel 
with its own *fully separate* HTTP messages going on between the client 
and server. Squid starts acting as an interception proxy for those messages.
  ** one of these messages is your log line #2.

Notice firstly that the CONNECT message is only between the client and 
Squid. There are no HTTP headers or such going to the server for tunnel 
setup - just a TCP SYN packet.

Notice secondly that the intercept-proxy/SSL-bump decrypted HTTP 
messages have no relationship to the CONNECT or any prior forward-proxy 
HTTP messages on the TCP connection. The only thing they have in common 
is that they arrived on the same TCP connection between haproxy and Squid.
If there is actually a relationship between them, it might be visible 
in the fact that haproxy received both from the same client at its end

   ...  or not. Because we don't know whether haproxy can actually see 
the origin client or just another proxy multiplexing traffic into _that_ 
TCP connection.

 3. Is there a way to generate a unique ID for each flow so that, besides
    the data in flow0, one can easily link these logs together?

That can only be done reliably by the client itself sending an HTTP 
header in all messages with its flow ID.
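
A minimal sketch of the client side of that idea, in Python. The header name "X-Flow-ID" and both helper functions are hypothetical, not something Squid or haproxy defines; any header name the client and your logformat agree on would do.

```python
import uuid

FLOW_HEADER = "X-Flow-ID"  # hypothetical header name, chosen for this example


def make_flow_id():
    """Generate one random ID that the client reuses for a whole flow."""
    return uuid.uuid4().hex


def flow_headers(flow_id, extra=None):
    """Return a header dict carrying the flow ID plus any other headers."""
    headers = dict(extra or {})
    headers[FLOW_HEADER] = flow_id
    return headers


flow_id = make_flow_id()
connect_headers = flow_headers(flow_id)                   # for the CONNECT message
inner_headers = flow_headers(flow_id, {"Accept": "*/*"})  # for requests inside the tunnel
```

The point is that the same value has to be attached both to the CONNECT and to every request sent inside the tunnel, since those are separate HTTP messages. How to set headers on the CONNECT itself depends on the client library.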

Otherwise the closest you can get is to define "flow" as everything from 
a haproxy ingress = { src-IP, src-port, dst-IP, dst-port, squid 
local-IP, squid local-port } to Squid's egress = { src-IP, src-port, 
dst-IP, dst-port, dst-domain }.
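
A sketch of how log records could be grouped on the ingress side, in Python. It assumes the log lines have already been parsed into dicts; the field names and sample values below are invented for the example and need mapping to whatever your logformat actually records. For brevity the key uses only the four values that identify the haproxy-to-Squid TCP connection rather than the full set listed above.

```python
from collections import defaultdict

# Hypothetical field names for the haproxy -> Squid connection.
INGRESS_FIELDS = ("src_ip", "src_port", "squid_local_ip", "squid_local_port")


def ingress_key(record):
    """Build the tuple identifying the TCP connection a record arrived on."""
    return tuple(record[field] for field in INGRESS_FIELDS)


def group_by_ingress(records):
    """Group parsed log records that share the same ingress connection."""
    flows = defaultdict(list)
    for record in records:
        flows[ingress_key(record)].append(record)
    return dict(flows)


# Made-up sample records: a CONNECT and a bumped GET on one connection,
# plus an unrelated request arriving from a different source port.
base = {"src_ip": "192.0.2.10", "squid_local_ip": "198.51.100.5",
        "squid_local_port": 3128}
records = [
    {**base, "src_port": 40001, "method": "CONNECT"},
    {**base, "src_port": 40001, "method": "GET"},
    {**base, "src_port": 40002, "method": "GET"},
]

flows = group_by_ingress(records)
for key, group in flows.items():
    print(key, [r["method"] for r in group])
```

Source ports get reused over time, so in practice the grouping would also need to be bounded by timestamp. And as noted earlier, sharing a connection does not prove the messages came from the same origin client.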

Amos
