After this month's "Black Tuesday" (the Tuesday on which Microsoft
released a large batch of bug fixes and patches, and also lifted
the pacing of Windows XP Service Pack 2 downloads), our
Squid caches went berserk, drawing massive amounts of data from the
Net and clogging our ISP's downstream feeds. Upon inspection, we
saw what went wrong. Windows Update downloads updates by requesting
portions of files -- as little as 300 bytes and as much as several
thousand -- via HTTP range requests. Unfortunately, when a Squid proxy is
between the Windows Update client and the Internet, this wreaks
havoc. When the first request occurs, the Squid proxy downloads the
entire file before providing the subrange of bytes to the client
(perhaps making the reasonable assumption that it will ask for
other portions later). But when the client makes its next request,
Squid revalidates with the Windows Update server and is told that its
cached copy of the file is out of date. So, it transfers the
entire file AGAIN. (If you're interested, I can send tcpdump output
showing this; it contains clients' addresses, so I probably shouldn't
post it publicly.) The smaller the chunks requested by the client,
the greater the wasted bandwidth.
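A back-of-the-envelope sketch of that last point (the numbers are illustrative, not measured; Python is used only for the arithmetic):

```python
def proxy_bytes_fetched(file_size: int, chunk_size: int) -> int:
    """Bytes pulled from the origin server if the proxy re-downloads
    the full file once per client range request (the behavior
    described above)."""
    num_requests = -(-file_size // chunk_size)  # ceiling division
    return num_requests * file_size

# A hypothetical 5 MB update fetched in 1 KB chunks means the proxy
# pulls the full file 5120 times:
file_size = 5 * 1024 * 1024
total = proxy_bytes_fetched(file_size, 1024)
print(total // file_size)  # number of full-file downloads
```

Halving the chunk size doubles the number of full-file transfers, which matches what we saw on the wire.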
It seems to make no difference if one sets "reload-into-ims" or
even "ignore-reload" and "override-expire" and "override-lastmod"
for Windows Update downloads. That's right: even with

    refresh_pattern download\.microsoft\.com 144000 100% 144000 ignore-reload override-expire override-lastmod

in squid.conf, Squid still reports misses on successive accesses to the same URL.
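One quick way to confirm the misses is to tally the cache result codes in Squid's access.log for repeated requests to the same URL. A minimal sketch (the log lines and URL below are made up, but follow the native access.log field layout: timestamp, duration, client, result/status, ...):

```python
from collections import Counter

# Hypothetical access.log excerpt: two identical range requests for
# the same Windows Update file, both logged as TCP_MISS.
sample_log = """\
1092700000.123 4512 10.0.0.5 TCP_MISS/206 712 GET http://download.microsoft.com/x.cab - DIRECT/207.46.0.1 application/octet-stream
1092700001.456 4388 10.0.0.5 TCP_MISS/206 712 GET http://download.microsoft.com/x.cab - DIRECT/207.46.0.1 application/octet-stream
"""

# Field 3 is the cache result and HTTP status, e.g. "TCP_MISS/206";
# keep only the result code before the slash and count occurrences.
codes = Counter(line.split()[3].split("/")[0]
                for line in sample_log.splitlines() if line.strip())
print(codes)  # on our caches: all TCP_MISS, never a TCP_HIT
```

If the second request were served from cache, it would show up as TCP_HIT (or TCP_MEM_HIT) instead.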
Can this problem be diagnosed and fixed? It's causing such a
massive waste of bandwidth that we're looking at dumping Squid.
--Brett Glass