Re: Long running squid proxy slows way down

"Amos Jeffries" <squid3@xxxxxxxxxxxxx> · Mon, 27 Apr 2009 13:19:07 +1200 (NZST)

> Hi,
>
> On Sun, 26 Apr 2009, Amos Jeffries wrote:
>
>> almost. The final one is:
>>  -> aggressive until swap_usage < cache_swap_low
>>  which could be only what's currently indexed (cache_swap_log), or could
>> be less since aggressive might re-test objects for staleness and discard
>> to reach its goal.
>
> I had presumed that squid had a heap or other $STRUCTURE which kept the
> cache objects in order of expiry so they could be purged immediately they
> expired.  Thinking about it though, perhaps that would kill off all
> possibility for TCP_IMS_HITs?
>

Squid has several replacement policies, implemented as heaps or lists, for
choosing which objects to remove.

(I'm groping in the dark here from quick looks at the code, so this is not
authoritative info anymore).

The first layer seems to be a list of 'discarded' files (FileNums) which
have been declared useless or replaced by newer data but not yet removed
from the physical disk storage. AFAICT that's the difference-list between
the data actually stored and the cache_swap_log index.

Second is the replacement policy, which finds the second round of objects
to remove under aggressive mode if the first round was not enough. I don't
know why at this stage (and can't think of a reason why it should), but
from all appearances it runs the refresh_pattern checks again.
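
A toy model of those two layers, as a rough Python sketch. None of the
names below come from the Squid source. The is_stale callback is a
hypothetical stand-in for the staleness re-test, oldest-first order stands
in for whatever replacement policy is configured, and I am assuming that
discarded files still count towards swap usage.

```python
from collections import OrderedDict

def collect(index, discarded, capacity, low_pct, high_pct, is_stale):
    """Toy two-layer collection pass; not Squid code.

    index     -- OrderedDict of name -> size, oldest first
    discarded -- dict of name -> size, on disk but no longer indexed
    is_stale  -- hypothetical callback standing in for the staleness re-test
    Returns the names removed, in removal order.
    """
    removed = []
    usage = sum(index.values()) + sum(discarded.values())
    if usage <= capacity * high_pct / 100:
        return removed  # at or below the high watermark: nothing to do

    low = capacity * low_pct / 100

    # Layer 1: unlink the files already declared useless.
    for name in list(discarded):
        usage -= discarded.pop(name)
        removed.append(name)

    # Layer 2 (aggressive): only runs if layer 1 was not enough.
    # Stale objects go first, then the rest in policy order.
    if usage >= low:
        stale = [n for n in index if is_stale(n)]
        fresh = [n for n in index if not is_stale(n)]
        for name in stale + fresh:
            if usage < low:
                break
            usage -= index.pop(name)
            removed.append(name)
    return removed
```

The point of the sketch is only the ordering: cheap removals first, then
policy-driven removals until usage drops below the low watermark.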

On Squid-2 there is also the header-updating mechanism, where every 3xx
reply copies the store object from disk to either memory or another disk
file, updating the stored info in the process (an attempt to fix bug #7).
This has performance impacts and race conditions all of its own, which need
solving before it can be used in Squid-3.

> Sorry to be constantly peppering you with these questions, I just find it
> all very interesting :-)

No problem.

>
>>>  Would it be better to calculate an absolute figure (say
>>> 200MB) and work out what percentage of your cache that is?  It seems
>>> like
>>> the 95% high watermark is probably quite low for large caches too?
>>
>> I agree. Something like that. AFAICT the high being less than 100% is to
>> allow X amount of new data to arrive and be stored between collection
>> cycles. 6 GB might be reasonable on a choked-full 100 MB pipe with 5
>> minute cycles. Or it might not.
>
> As I mentioned we have a 20GB gap by default and are on a 40MB pipe which
> is often quite choked.  I can't say we've noticed the collection cycles
> but
> maybe we're not measuring it right.
>
> I'll probably change the thresholds to 98%,99%.

My back-of-envelope calculations for a 40MB pipe indicate that (assuming
_everything_ must be cached for >0 seconds) a 6GB gap would be sufficient.
That does not account for IMS_HITs and non-cacheable MISSes though, which
reduce the required gap further.
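
For anyone who wants to redo the arithmetic with their own numbers, here is
a minimal Python sketch. It assumes the pipe speed is given in Mbit/s and
that the headroom needed is simply what a saturated link can deliver in one
collection cycle; both function names and the cacheable_fraction parameter
are made up for illustration. It rounds down to a whole percent, which is
how I would set the high watermark.

```python
def headroom_bytes(link_mbit_per_s, cycle_seconds, cacheable_fraction=1.0):
    """Worst-case bytes that can arrive between two collection cycles.

    cacheable_fraction is a hypothetical knob for discounting IMS hits
    and non-cacheable misses; 1.0 means everything gets stored.
    """
    bytes_per_cycle = link_mbit_per_s * 1_000_000 // 8 * cycle_seconds
    return int(bytes_per_cycle * cacheable_fraction)

def high_watermark_pct(cache_bytes, headroom):
    """Largest whole percentage that still leaves `headroom` bytes free."""
    return (cache_bytes - headroom) * 100 // cache_bytes

if __name__ == "__main__":
    # Example inputs only: 40 Mbit/s link, 5 minute cycles, 100 GB cache.
    need = headroom_bytes(40, 300)
    print(need, high_watermark_pct(100 * 10**9, need))
```

Swap in your own link speed, cycle length and total cache_dir size; the
result is only an upper bound on the headroom, not a tuned value.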

>
>>>  Would it make more sense for squid
>>> to offer absolute watermarks (in MB offset from the total size)?
>>
>> Yes this is one of the ancient aspects remaining in Squid and different
>> measures may be much better. I'm having a meeting with Alex Rousskov in
>> approx 5 hours on IRC (#squiddev on irc.freenode.net) to discuss the
>> general store improvements for 3.2. This is very likely to be one of the
>> topics.
>
> Please do let us know how you get on :-)

We missed meeting up unfortunately. Will be trying again another time.

Amos