Re: Trying to improve the Byte Hit Ratio, any tips ?

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Tue, 06 Jan 2009 17:49:10 +1300

Vianney Lejeune wrote:
Hello,

    I'm trying to improve the Byte Hit Ratio of the Squid cache on my
network. There are 220 computers on the LAN, used for general internet
access. The maximum bandwidth is 4 Mbps in/out, and the total amount of
data is estimated at 30 to 60 GB daily.

This is the report from cachemgr:
=>
    Average HTTP requests per minute since start:    1023.9
    Average ICP messages per minute since start:    0.0
    Select loop called: 1208577 times, 5.619 ms avg
Cache information for squid:
    Request Hit Ratios:    5min: 37.9%, 60min: 41.1%
    Byte Hit Ratios:    5min: 13.2%, 60min: 13.8% (quite low, but these values are typical)
    Request Memory Hit Ratios:    5min: 2.0%, 60min: 2.6% (I rebooted the server 3 hours ago, which may explain these low values)
    Request Disk Hit Ratios:    5min: 41.3%, 60min: 36.3%
    Storage Swap size:    27654312 KB
    Storage Mem size:    190364 KB
    Mean Object Size:    29.65 KB
    Requests given to unlinkd:    33035
Median Service Times (seconds)  5 min    60 min:
    HTTP Requests (All):   0.23230  0.46965
    Cache Misses:          0.35832  0.72387
    Cache Hits:            0.19742  0.35832
    Near Hits:             0.20843  0.55240
    Not-Modified Replies:  0.03829  0.05331
    DNS Lookups:           0.00094  0.00779
    ICP Queries:           0.00000  0.00000
<=

This is my squid.conf file:
=>

http_port 3128 transparent
hierarchy_stoplist cgi-bin ?

acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY

Without cache peers you can drop the above QUERY acl, together with the
"cache deny QUERY" line that uses it.
That will raise both hit ratios on semi-dynamic objects.
BUT, see addition to refresh_pattern below...

acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
maximum_object_size 128 MB

Re: the above maximum: there may be objects larger than 128 MB going
through that could be cached.

cache_mem 250 MB
maximum_object_size_in_memory 50 KB

Memory, memory, memory. The more you can throw at the problem, the more 
objects can be kept and served while hot. A 64-bit Squid can easily 
handle many GB of memory cache (at the cost of a slow shutdown while it 
saves the hottest objects to disk for the next round).
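As a rough sketch of what "more memory" could look like in squid.conf: the
sizes below are illustrative assumptions, not values taken from this setup,
and need to be matched to the RAM the box really has. Note that cache_mem
covers only the in-memory object cache; the Squid process as a whole uses
more than this (the index of the disk cache lives in RAM too).

```
# Illustrative values only; size these to the RAM actually available.
cache_mem 2048 MB
# Let somewhat larger objects qualify for the memory cache.
maximum_object_size_in_memory 512 KB
```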

cache_replacement_policy heap LFUDA

It's been a while since I looked at these. To maximize bytes you want 
the policy that looks at object size as well as 'coldness', so that the 
smaller cool objects are removed before the larger, equally cool ones.
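For reference, a sketch of the heap policy choices as squid.conf describes
them. The memory_replacement_policy line is an assumption added for
illustration; it was not part of the posted config.

```
# heap LFUDA: keeps popular objects regardless of size; documented as
#             favouring byte hit ratio at the expense of request hit ratio.
# heap GDSF:  keeps smaller popular objects; documented as favouring
#             request hit ratio.
cache_replacement_policy heap LFUDA
# Assumption: a different policy for the memory cache, where small hot
# objects matter most.
memory_replacement_policy heap GDSF
```

The squid.conf notes for LFUDA also say that maximum_object_size should be
well above its 4 MB default for the policy to pay off, which the 128 MB
setting already satisfies.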

cache_dir ufs /data/spool/squid 30000 16 256

Your cache dir is only 30 GB. That's one day's traffic or less by your 
above figures. For good hit ratios you may need room for at least 7 
days, preferably as close to 30 as possible.

Depending on your OS, AUFS (Linux) or diskd (*BSD) may give much faster 
access than UFS.
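A minimal sketch of a larger cache_dir, assuming Linux, a Squid build that
includes aufs, and a disk with the space to spare. The 210000 figure is just
7 days at the low 30 GB/day estimate (7 x 30 GB = 210 GB); it is an
illustration, not a measured requirement.

```
# cache_dir <type> <path> <size in MB> <L1 dirs> <L2 dirs>
# Hypothetical sizing: about 7 days at 30 GB/day.
cache_dir aufs /data/spool/squid 210000 16 256
```

Keep in mind that the in-memory index grows with the number of objects on
disk, so a much bigger cache_dir also needs more RAM.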

access_log none
cache_log none

The above is generating a log file named "none". It would be more useful 
to set debug_options ALL,0. If you really don't want to know about the 
critical problems that do happen, then set the filename to /dev/null as well.
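Something along these lines, as a sketch. The log path is an assumption;
use whatever location suits your system.

```
# Record only critical messages in cache.log.
debug_options ALL,0
# Assumed path. Point this at /dev/null instead to discard even those.
cache_log /var/log/squid/cache.log
```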

cache_store_log none
log_ip_on_direct off
hosts_file /etc/hosts
refresh_pattern ^ftp:        1440    20%    10080
refresh_pattern ^gopher:    1440    0%    1440

Without the QUERY acl above, you will need this right here in the pattern order:
 refresh_pattern -i (/cgi-bin/|\?)  0 0% 0

refresh_pattern .        0    20%    4320
quick_abort_min 0 KB
quick_abort_max 0 KB
range_offset_limit 0 KB

Be careful, but you may want to experiment with setting these so that 
downloads continue after the client aborts (quick_abort_min -1 KB).
That will cause all partial and restarted downloads to become HITs 
later, at the risk of some wasted bandwidth.
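A sketch of what that experiment could look like. Treat it as something to
try and measure, not a recommendation: on a 4 Mbps link, a few abandoned
large downloads that Squid insists on finishing can eat a lot of bandwidth.

```
# Keep fetching a cacheable object even after the client has gone away.
quick_abort_min -1 KB
# On Range requests, fetch from the start of the object so the whole
# thing can be cached.
range_offset_limit -1 KB
```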

half_closed_clients off
shutdown_lifetime 0 seconds
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443        # https
acl SSL_ports port 563        # snews
acl SSL_ports port 873        # rsync
acl Safe_ports port 80        # http
acl Safe_ports port 21        # ftp
acl Safe_ports port 443        # https
acl Safe_ports port 70        # gopher
acl Safe_ports port 210        # wais
acl Safe_ports port 1025-65535    # unregistered ports
acl Safe_ports port 280        # http-mgmt
acl Safe_ports port 488        # gss-http
acl Safe_ports port 591        # filemaker
acl Safe_ports port 777        # multiling http
acl Safe_ports port 631        # cups
acl Safe_ports port 873        # rsync
acl Safe_ports port 901        # SWAT
acl purge method PURGE
acl CONNECT method CONNECT
acl ReseauLocal src 10.0.0.0/16
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access allow localhost
http_access allow ReseauLocal
http_access deny all
http_reply_access allow all
icp_access deny all
cache_effective_group proxy
httpd_suppress_version_string on
via off
forwarded_for off
log_icp_queries off
client_db off
coredump_dir /var/spool/squid
pipeline_prefetch off
<=

Do you see anything that needs to be improved? Did I miss something?

There's a lot of tweaking that can be done with refresh_pattern to warp 
things into staying cached longer than they are supposed to be. I won't 
advocate any of it, though.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11
  Current Beta Squid 3.1.0.3