Mike Crowe wrote:
I'm using Squid (Debian Lenny stable 3.0.STABLE8-3) as a mechanism for pre-caching large downloads for a number of client hosts that are powered down overnight, so that the files they require are ready for them to download in the morning. In the late afternoon each client connects to Squid, starts downloading each file, and then disconnects as soon as the data starts flowing. I set "quick_abort_min -1" to ensure that Squid continues the download regardless.

I now need to limit the bandwidth used by Squid while it caches the files. I initially experimented with Linux traffic control, but found that the server just stopped sending packets on some of the connections after a while because they were not making any progress; fiddling with timeouts didn't seem to be fully effective. Squid's built-in delay pool system worked much better and didn't result in any dropped connections, even when concurrently downloading a hundred large files at 100kbit/s. Even the fact that more bandwidth is available during certain hours could easily be handled using "acl time"[1]. Here are the interesting parts of my current configuration:

  acl nighttime time 00:00-05:00

  delay_pools 2
  delay_class 1 1
  delay_class 2 1

  delay_access 1 allow nighttime
  delay_access 1 deny !nighttime
  delay_access 2 allow !nighttime
  delay_access 2 deny nighttime

  delay_parameters 1 120000/120000
  delay_parameters 2 12000/12000

But, as the FAQ states, once the client disconnects there is no longer any link back to the delay pool, and the download proceeds at full speed. :(

I've been trying to come up with workarounds so that I can keep both the bandwidth-shaping behaviour and the slow abort. My understanding of the Squid code is minimal, but looking at MemObject::mostBytesAllowed() I wonder whether it might be possible for MemObject to store the last DelayId, so that when the clients list is empty it has something to fall back on?
This may be ineffective if all clients have disconnected before the first read, but perhaps that can be fixed by setting the persistent DelayId in MemObject::addClient() too.
This may be suitable for your needs, but it is not optimal elsewhere. Why should the last visiting client be penalized for an admin configuration choice? Their follow-up pool allowance would be reduced by the bandwidth consumed by the no-longer-needed request, for the whole duration of the finishing-up pull.
Alternatively, I've wondered whether I could write a redirector or external ACL helper which effectively ran wget through the proxy for every URL received, taking care not to resubmit URLs it has already handled. I believe this solution could also easily be extended to support retrying interrupted downloads, which would be a bonus.
You could. Note, though, that client requests which test that ACL will hang until the helper returns, so it's probably better to scan the log periodically and re-fetch.
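A periodic log scan along those lines might look like the following sketch. The log path, "seen" file, and proxy address are assumptions for illustration; it relies on Squid's native access.log layout, where field 6 is the request method and field 7 the URL:

```shell
#!/bin/sh
# Sketch: scan Squid's access log and re-fetch each URL through the proxy,
# so that client-aborted downloads finish landing in the cache under the
# normal delay pools. Run it periodically, e.g. from cron.

refetch_from_log() {
    log="${1:-/var/log/squid/access.log}"         # assumed default path
    seen="${2:-/var/cache/prefetch/seen.txt}"     # URLs already submitted
    proxy="${3:-http://127.0.0.1:3128}"           # assumed proxy address
    touch "$seen"
    # Native access.log format: field 6 is the method, field 7 the URL.
    awk '$6 == "GET" { print $7 }' "$log" | sort -u |
    while read -r url; do
        grep -qxF "$url" "$seen" && continue      # skip duplicates
        echo "$url" >> "$seen"
        # Discard the body: the point is only to pull the object into
        # Squid's cache, not to keep a local copy.
        http_proxy="$proxy" wget -q -O /dev/null "$url" || true
    done
}
```

Retrying interrupted downloads then falls out for free: re-requesting a URL through the proxy lets Squid serve the cached portion and fetch the rest.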
Does anyone with greater knowledge than me have any comments on these proposals, or any better ideas?

Thanks.

Mike.

[1] Although in my testing it appeared that the bandwidth did not change at the prescribed time for a download that was already in progress.
Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
  Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1