Gavin McCullagh wrote:
Hi,
I don't mean to labour this, I'm just keen to understand better and
obviously you guys are the experts on squid.
While I'm definitely not an expert on squid, I like this thread, so I'll
put another 2 (euro)cents on the table...
On Mon, 16 Mar 2009, Marcello Romani wrote:
Really? I would have thought the linux kernel's disk caching would be far
less optimised for this than using a large squid cache_mem (whatever about
a ramdisk).
As others have pointed out, squid's cache_mem is not used to serve
on-disk cache objects, while the OS's disk cache will hold those objects
in RAM after squid requests them for the first time.
Agreed. I would have thought though that a large cache_mem would be a
better way to increase the data served from RAM, compared to the OS disk
caching.
I imagine, perhaps incorrectly, that squid uses the mem_cache first for
data, then when it's removed (by LRU or whatever), pushes it out to the
disk cache. This sounds like it should lead to a pretty good
mem_cache:disk_cache serving ratio. I don't have much to back this up, but
the ratio in my own case is pretty high, so squid appears not to just treat
all caches (memory and disk) equally.
http://deathcab.gcd.ie/munin/gcd.ie/watcher.gcd.ie-squid_cache_hit_breakdown.html
Maybe you have already seen this page, but I suggest nonetheless having a
look at
http://wiki.squid-cache.org/SquidFaq/SquidMemory
which has detailed info about squid's memory usage.
Looking at the graph I see an average 15% memory hit ratio vs. an
average 35% disk hit ratio. I would try to lower cache_mem to allow more
of that 35% to be served from disk cache ram, and see if this makes some
difference.
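For anyone following along, the directives being discussed look roughly
like this in squid.conf (the values here are made-up examples, not
recommendations for this setup):

```
# Illustrative squid.conf fragment -- example values only:
cache_mem 512 MB                               # RAM squid itself uses for hot objects
maximum_object_size_in_memory 64 KB            # only small objects stay in cache_mem
cache_dir aufs /var/spool/squid 20000 16 256   # 20 GB on-disk cache
```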
By comparison, I would expect linux's disk caching, which has no
understanding of the fact that this is a web proxy cache, to be less smart.
Perhaps that's incorrect though, I'm not sure what mechanism linux uses.
So if you leave most of the RAM to the OS for disk cache, you'll end up
having many on-disk objects loaded from RAM, i.e. very quickly.
Some, but I would imagine not as many as with mem_cache.
Squid keeps the hottest (i.e. the most requested) objects in cache_mem,
so if that value is too low, those objects will be served from disk.
But if your disk cache is big enough, then those objects will not be
read from the physical disk, but from the OS disk cache, which is almost
as fast as cache_mem, imho (I guess cache_mem is faster, but on the kind
of hardware you have I don't think the speed difference would be visible
to users. This is only a guess, of course...)
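As a toy illustration of the two-tier behaviour being discussed (this is
a simulation I made up, not squid's actual code): a small in-process LRU
stands in for cache_mem, a larger LRU stands in for the kernel's page
cache sitting over the disk store, and only a request that misses both
costs a physical disk read.

```python
from collections import OrderedDict
import random

class TwoTierCache:
    """Toy model: a small in-process LRU (cache_mem) backed by a
    larger LRU standing in for the OS page cache over the disk store."""

    def __init__(self, mem_size, page_cache_size):
        self.mem = OrderedDict()    # squid's cache_mem
        self.page = OrderedDict()   # kernel page cache
        self.mem_size = mem_size
        self.page_size = page_cache_size
        self.hits = {"mem": 0, "page": 0, "disk": 0}

    def get(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)
            self.hits["mem"] += 1
        elif key in self.page:
            self.page.move_to_end(key)
            self.hits["page"] += 1      # served from RAM, no disk I/O
            self._promote(key)
        else:
            self.hits["disk"] += 1      # physical disk read
            self.page[key] = True       # object now resident in page cache
            self._trim(self.page, self.page_size)
            self._promote(key)

    def _promote(self, key):
        self.mem[key] = True
        self._trim(self.mem, self.mem_size)

    @staticmethod
    def _trim(lru, limit):
        while len(lru) > limit:
            lru.popitem(last=False)     # evict least recently used

random.seed(0)
cache = TwoTierCache(mem_size=10, page_cache_size=100)
for _ in range(10000):
    # crude popularity skew: low-numbered keys are requested far more often
    key = min(int(random.paretovariate(1.2)), 500)
    cache.get(key)
print(cache.hits)
```

With a skewed workload like this, the hottest objects live in the small
memory tier, a second band is served from the page-cache tier, and only
the cold tail touches the disk, which is the split the munin graph shows.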
Also, squid needs memory besides cache_mem, for its own internal
structures and for managing the on-disk repository. If its address space
is already almost filled up by cache_mem alone, it might have problems
allocating its own memory structures.
Absolutely agreed and the crashes I've seen appear to be caused by this,
though dropping to around 1.7GB mem_cache appears to cure this.
OK
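A quick back-of-envelope check fits that explanation. The figures below
are rough assumptions (the SquidFaq/SquidMemory page linked earlier gives
a rule of thumb on the order of 10-15 MB of in-memory index per GB of
disk cache; the disk-cache size and overheads here are invented for
illustration):

```python
# All figures are illustrative assumptions, not measurements.
address_space_mb = 3 * 1024   # usable address space for a 32-bit process on linux
cache_mem_mb     = 2 * 1024   # a 2 GB cache_mem, as discussed above
disk_cache_gb    = 60         # hypothetical total cache_dir size
index_mb_per_gb  = 14         # rough in-memory index cost per GB of disk cache
misc_mb          = 100        # pools, I/O buffers, libraries, etc.

footprint = cache_mem_mb + disk_cache_gb * index_mb_per_gb + misc_mb
headroom = address_space_mb - footprint
print(f"estimated footprint {footprint} MB, headroom {headroom} MB")
# Only 84 MB of headroom under these assumptions; heap fragmentation can
# easily exhaust that, which matches crashes that stop once cache_mem is
# lowered.
```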
The question then is, which would be better, an extra cache based on a
ramdisk, or just leaving it up to the kernel's disk caching.
The OS's disk cache, on the other hand, is not allocated from squid's
process memory space and also has a variable size, automatically adjusted
by the OS as application memory needs grow or shrink.
Right. A ramdisk is not allocated from squid's process space either, but
it doesn't shrink the way linux's disk caching would, and that might
cause swapping in a bad situation. That's a clear advantage for linux's
caching. Simplicity is another clear advantage.
The question I'm left with is, which of the two would better optimise the
amount of data served from ram (thus lowering iowait), linux's caching or
the ramdisk?
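Should anyone want to try the ramdisk route, a tmpfs-backed cache_dir
would look something like this (paths and sizes invented; note that tmpfs
contents vanish on reboot, so squid rebuilds that cache from scratch):

```
# Example only -- mount a 2 GB tmpfs and add a cache_dir on it:
#   mount -t tmpfs -o size=2g tmpfs /var/spool/squid-ram
# then in squid.conf (size slightly below the mount to leave slack):
#   cache_dir aufs /var/spool/squid-ram 1800 16 256
# Unlike the kernel's page cache, these 2 GB stay pinned for squid and
# will not shrink when other processes need memory.
```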
I guess it's not a very normal setup, so maybe nobody has done this.
Thanks for all the feedback,
Gavin
Recently on this list someone suggested using more than one squid
process on the same server to make better use of SMP hardware and the
huge amount of ram a 32 bit squid cannot directly address. I don't
remember the thread title right now, but maybe some additional food for
thought can be found in the recent archives.
I guess those were my last 2 cents. :-)
HTH
--
Marcello Romani