On 1/04/21 6:59 am, Garbacik, Joe wrote:
In my squid.conf, I have the following logformat which passes all the
data from the client via the load balancer to the squid server as headers:
...
This creates the two logs at the end of this message. What I am
wondering is:
1. Why aren't all the request headers (look between *** REQUEST
HEADERS and *** RESPONSE HEADERS in each log) seen in the first log
present in the second log?
They are different transactions.
2. I'm assuming that since Squid is then making the request in the second
log, it leaves the items in flow0 (client -> load balancer) empty but
does retain the data for flow1 (load balancer -> Squid) and flow2
(Squid -> destination). Even the XFF is not passed. Is there any way
to retain this data?
First log entry is an HTTP request to initiate (CONNECT) a tunnel.
Second log entry is an HTTPS request to fetch (GET) data from a server.
What is happening is:
In the beginning there exists a TCP connection between haproxy and
Squid, transferring HTTP messages.
One of those messages is a CONNECT request, meaning: connect this current
TCP connection to the named server:port and stop performing HTTP - all
following bytes will be some other protocol.
When Squid acknowledges that the tunnel is connected, bytes initiating a
TLS connection start arriving. Squid does its SSL-Bump things to look inside.
The CONNECT message and the state related to it are now complete. It gets
logged and discarded.
** this message is your log line #1.
What is found inside the TLS is a private HTTP communication channel
with its own *fully separate* HTTP messages going on between the client
and server. Squid starts acting as an interception proxy for those messages.
** one of these messages is your log line #2.
Notice firstly that the CONNECT message is only between the client and
Squid. There are no HTTP headers or such going to the server for tunnel
setup - just a TCP SYN packet.
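
To illustrate the point above, here is a minimal Python sketch of what a proxy needs from a CONNECT request. It is not how Squid is implemented; `parse_connect` and the sample request (including its header values) are made up for illustration. The sketch shows that only the host and port from the request line are needed to open the upstream TCP connection, while the headers stay at the proxy:

```python
def parse_connect(raw: bytes):
    """Split a CONNECT request into (host, port, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    # The request-target of CONNECT is "host:port".
    host, _, port = target.rpartition(":")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return host, int(port), headers


# Hypothetical sample request, as a load balancer might send it.
request = (
    b"CONNECT example.com:443 HTTP/1.1\r\n"
    b"Host: example.com:443\r\n"
    b"X-Forwarded-For: 192.0.2.10\r\n"
    b"\r\n"
)
host, port, headers = parse_connect(request)

# Only (host, port) is used to open the upstream TCP connection.
# The headers are seen (and can be logged) by the proxy, not forwarded.
upstream_target = (host, port)
```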
Notice secondly that the intercept-proxy/SSL-bump decrypted HTTP
messages have no relationship to the CONNECT or any prior forward-proxy
HTTP messages on the TCP connection. The only thing they have in common
is that they arrived on the same TCP connection between haproxy and Squid.
If there is actually a relationship between them, it might be visible
in the fact that haproxy received both from the same client at its end
... or not, because we don't know whether haproxy can actually see
the origin client or just another proxy multiplexing traffic into _that_
TCP connection.
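
As a rough sketch of that shared-connection idea, the Python below groups log records by the haproxy-to-Squid connection they arrived on. The record layout, field names and addresses are hypothetical (they assume the log has already been parsed into dictionaries), and the caveat above still applies: sharing a connection does not prove the records came from the same origin client.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records. "client" is the haproxy side
# of the connection (IP, port); "local" is the Squid side (IP, port).
records = [
    {"method": "CONNECT", "url": "example.com:443",
     "client": ("198.51.100.7", 40112), "local": ("10.0.0.5", 3128)},
    {"method": "GET", "url": "https://example.com/index.html",
     "client": ("198.51.100.7", 40112), "local": ("10.0.0.5", 3128)},
    {"method": "CONNECT", "url": "example.org:443",
     "client": ("198.51.100.7", 40113), "local": ("10.0.0.5", 3128)},
]


def group_by_connection(records):
    """Group records by the (client, local) address pair they arrived on."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["client"], rec["local"])].append(rec)
    return dict(groups)


groups = group_by_connection(records)
for connection, recs in groups.items():
    print(connection, [r["method"] for r in recs])
```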
3. Is there a way to generate a unique ID for each flow so that, besides
the data in flow0, one can easily link these logs together?
That can only be done reliably by the client itself sending an HTTP
header in all messages with its flow ID.
Otherwise the closest you can get is to define "flow" as everything from
a haproxy ingress = { src-IP, src-port, dst-IP, dst-port, squid
local-IP, squid local-port } to Squid's egress = { src-IP, src-port,
dst-IP, dst-port, dst-domain }.
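
Both options can be sketched in Python. This is only an illustration under stated assumptions: the header name `X-Flow-Id` is made up (no such standard header exists), the helper names are hypothetical, and the addresses are documentation examples. The first helper shows a client tagging every message it sends with the same ID; the second derives a fallback ID from the ingress and egress tuples described above.

```python
import hashlib
import uuid

FLOW_HEADER = "X-Flow-Id"  # hypothetical header name, not a standard


def with_flow_id(headers, flow_id):
    """Return a copy of the headers with the client's flow ID added."""
    tagged = dict(headers)
    tagged[FLOW_HEADER] = flow_id
    return tagged


def tuple_flow_id(ingress, egress):
    """Derive a short, stable ID from the ingress and egress tuples."""
    material = "|".join(str(field) for field in (*ingress, *egress))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


# Option 1: the client picks one ID and sends it in every message,
# both the CONNECT and the requests inside the tunnel.
flow_id = uuid.uuid4().hex
connect_headers = with_flow_id({"Host": "example.com:443"}, flow_id)
get_headers = with_flow_id({"Host": "example.com"}, flow_id)

# Option 2: fall back to the connection tuples seen at the proxy.
ingress = ("198.51.100.7", 40112, "10.0.0.5", 3128, "10.0.0.5", 3128)
egress = ("10.0.0.5", 51544, "203.0.113.9", 443, "example.com")
fallback_id = tuple_flow_id(ingress, egress)
```

The tuple-based ID is only as good as the tuples themselves: if another proxy multiplexes several clients into one connection, they all end up with the same ID.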
Amos