On 16/04/11 08:52, Henry Yuan wrote:
Hi folks,
A caveat: I don't have practical network management experience, so the
following could be total nonsense ....
1) To my understanding, transparent caching proxies (on the edge of
the network) basically hijack the http traffic by stealing the
packets from the wire.
There are several meanings to "transparent". Squid does three of
those meanings directly itself and the other three when extended with
additional software.
Assuming you mean "NAT interception", that is a correct understanding.
There is no "basically" about it though. "Hijack" or "man-in-middle
attack" are the right descriptions for interception of any kind.
Since the hosts are not aware of the existence of this proxy, the
Dest. Mac address in those packets will not be the same as the one of
the proxy host.
Not true. The destination MAC is simply the *next* relay host for the
packet. The MAC changes with every NIC the packet travels over. You need
to look up a bit on how TCP, MAC and ARP work.
In summary:
If you place Squid on a router, the dest MAC will be the router's (aka
Squid's).
If you place Squid on a third box and pass the packets over from a router
(via policy route or tunnel), the dest MAC will be Squid's and the src
MAC will be the router's.
If you place Squid on a bridge and capture packets, the MAC will be
indeterminate but usually the bridge NIC or the NIC of the router just
past the bridge.
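
As a rough illustration of the "third box" case, here is a toy Python
model of what happens to the addresses on each hop. It is only a sketch:
the Frame class, the relay() helper and all addresses are made up for the
example, and nothing here is Squid or kernel code. The point is that the
MAC addresses are rewritten at every relay while the IP addresses stay the
same.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def relay(frame: Frame, own_mac: str, next_hop_mac: str) -> Frame:
    """Re-emit a frame on the next link: MACs are rewritten, IPs are not."""
    return replace(frame, src_mac=own_mac, dst_mac=next_hop_mac)


# Hypothetical addresses, for illustration only.
CLIENT_MAC = "02:00:00:00:00:01"
ROUTER_MAC = "02:00:00:00:00:02"
SQUID_MAC = "02:00:00:00:00:03"
CLIENT_IP = "192.0.2.10"
SERVER_IP = "203.0.113.80"

# The client addresses the IP packet to the server, but the Ethernet frame
# goes to its gateway, because that is the next relay host.
from_client = Frame(CLIENT_MAC, ROUTER_MAC, CLIENT_IP, SERVER_IP)

# The router policy-routes the packet to a separate Squid box.
to_squid = relay(from_client, own_mac=ROUTER_MAC, next_hop_mac=SQUID_MAC)

print(to_squid)
```

The frame that reaches the Squid box carries Squid's MAC as destination
and the router's as source, so the NIC has no reason to drop it. The
destination IP is still the web server's.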
In other words, I am assuming the transparent proxy is like:
Client ------------ (Squid) ------------ Server
Where the squid can steal the packets silently...
The questions are:
- Why wouldn't the proxy host drop the packets with a different MAC address?
See above.
- What's the role of NAT in this setup?
To change the port number from 80 to whatever Squid is listening on, and
to mark the packet as going to the Squid host instead of being relayed
out to the network unchanged. See below.
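
To make the NAT step concrete, here is a toy Python model of the rewrite.
It is a sketch under assumptions, not firewall configuration: the Packet
class, the nat_redirect() helper, the 192.0.2.1 address and the 3129
listening port are all invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical values: the Squid box's own address and the port Squid
# is assumed to listen on for intercepted traffic.
SQUID_IP = "192.0.2.1"
SQUID_PORT = 3129


def nat_redirect(pkt: Packet) -> Packet:
    """Rewrite port-80 packets so they are addressed to the local Squid."""
    if pkt.dst_port == 80:
        return replace(pkt, dst_ip=SQUID_IP, dst_port=SQUID_PORT)
    return pkt  # everything else passes through unchanged


web = Packet("192.0.2.10", 40000, "203.0.113.80", 80)
ssh = Packet("192.0.2.10", 40001, "203.0.113.80", 22)

print(nat_redirect(web))
print(nat_redirect(ssh))
```

Only the destination is touched. The source address and port stay as the
client sent them, and non-HTTP traffic is left alone.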
- Why doesn't Squid monitor port 80 directly? (I asked this question
in a previous post, I'm just listing it here to make my questions
clear)
This diagram of how packet handling works in Linux should make it
clearer: http://l7-filter.sourceforge.net/PacketFlow.png
Looking at the top (green) layer, the packets come in and NAT changes
them so that the "Routing decision" passes them *up* to the "local
processes" (aka Squid) instead of passing them *straight* out the other
side of the box.
When running on a bridge there are extra ebtables rules to change the
"Bridging decision" down in the blue layer as well.
That is specific to iptables, but all packet handlers have a similar
distinction between local and non-local packets.
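
The local versus non-local distinction can be sketched in a few lines of
Python. This is a deliberately simplified model, not how any kernel
implements it: routing_decision() and the addresses are hypothetical, and
a real routing decision looks at far more than the destination address.

```python
# Hypothetical addresses owned by the Squid box.
LOCAL_ADDRESSES = {"127.0.0.1", "192.0.2.1"}


def routing_decision(dst_ip: str) -> str:
    """Return 'local' for packets addressed to this box, else 'forward'."""
    return "local" if dst_ip in LOCAL_ADDRESSES else "forward"


original_dst = "203.0.113.80"   # the web server the client asked for
after_nat_dst = "192.0.2.1"     # destination once NAT has rewritten it

print(routing_decision(original_dst))   # forward
print(routing_decision(after_nat_dst))  # local
```

Without the NAT rewrite the packet is just forwarded and Squid never sees
it. With the rewrite it is handed up to a local process.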
<snip>
- Can Squid monitor on port 80 directly?
Only if the client thinks that the squid box IP and port are where the
domain is hosted. This is the main difference between intercept and
reverse proxy.
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.12
Beta testers wanted for 3.2.0.6