I'm wondering if I'm not getting the best usage out of my squid cache.
I'm also wondering if there might not be a bug in squid regarding the "-z" switch.
I noted some time ago that when I had 256 top level dirs in my squid
cache, and 256 dirs in each of those, only the first few entries were
being used at the top level (dirs 00,01,02, maybe 03). The other top
level dirs were empty except for the empty subdirs. My cache
size was 6 GB.
I decided if only the first few TLdirs were being used, I probably didn't need 256x256 (65536) directories as most of them were empty.
I decided to try 64x64 (4096) directories instead. I altered the 'squid.conf' to specify cache dir as:
cache_dir ufs /var/cache/squid 6144 64 64
When I ran "squid -z", I got another cache with 256x256 directories under it. I thought it might be an error in initialization and pruned off the extra 192 directories at both levels.
Now I have a directory structure that is 64x64. The cache was created more than a month ago. I know my housemate and I have browsed quite a bit in the past month (we have a DSL line). My max object size is set to 256 MB, so even large files like a Linux kernel download would likely be in the cache. However, at this point, only 37 of the 64 top-level dirs are being used. Also of note: only 1.3 GB of the 6 GB allotted is being used.
I counted the number of files per directory and listed the number of directories containing "X" files:

# dirs   "X" (files in dir)
  1748      0
    49      1
    86      2
    98      3
    97      4
   123      5
   134      6
   160      7
   165      8
   162      9
   149     10
   163     11
   143     12
   126     13
   117     14
    99     15
    92     16
    87     17
    59     18
    46     19
    42     20
    41     21
    33     22
    17     23
    15     24
    12     25
     5     26
    13     27
     6     28
     2     29
     1     30
     2     31
     1     34
     1     36
     1     37
     1     53
 -----   ----
  4096 directories
  4752 total files stored in all dirs
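For anyone who wants to reproduce this kind of count, here is a minimal Python sketch. It is not the command originally used to build the table; it assumes the usual two-level ufs layout (top-level dirs, each holding second-level dirs that contain the object files), and the function names `files_per_dir_histogram` and `totals` are made up for illustration.

```python
import os
from collections import Counter


def files_per_dir_histogram(cache_root):
    """Return a Counter mapping 'files in a second-level dir' -> 'number of such dirs'."""
    hist = Counter()
    for l1 in os.listdir(cache_root):
        l1_path = os.path.join(cache_root, l1)
        if not os.path.isdir(l1_path):
            continue  # skip plain files at the top level (e.g. the swap log)
        for l2 in os.listdir(l1_path):
            l2_path = os.path.join(l1_path, l2)
            if not os.path.isdir(l2_path):
                continue
            n_files = sum(
                1
                for name in os.listdir(l2_path)
                if os.path.isfile(os.path.join(l2_path, name))
            )
            hist[n_files] += 1
    return hist


def totals(hist):
    """Return (number of directories, number of files) implied by a histogram."""
    n_dirs = sum(hist.values())
    n_files = sum(files * dirs for files, dirs in hist.items())
    return n_dirs, n_files


# Example (path taken from the cache_dir line in this post):
# hist = files_per_dir_histogram("/var/cache/squid")
# for files, dirs in sorted(hist.items()):
#     print(f"{dirs:6d} {files:4d}")
# print(totals(hist))
```

As a cross-check, applying the same weighted sum to the rows listed above gives 4096 directories but 25304 files rather than 4752, so one of those two figures may have been mistyped or counted in a different way.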
From this, I see squid has only cached 4752 files !!??....I know from
the daily squid proxy-cache reports that it's not uncommon to fetch
30,000 items in one day with about 909.96% being cache MISSES. So
where are all these items? According to the daily cache log from this morning:
2005/05/06 05:16:10| storeDirWriteCleanLogs: Starting...
2005/05/06 05:16:11| 65536 entries written so far.
2005/05/06 05:16:11| 131072 entries written so far.
2005/05/06 05:16:11| 196608 entries written so far.
2005/05/06 05:16:11| Finished. Wrote 206670 entries.
2005/05/06 05:16:11| Took 0.2 seconds (869922.1 entries/sec).
2005/05/06 05:16:11| logfileRotate: /var/log/squid/access.log
---------
So 1) Is it a bug that squid -z ignores the number of dirs given on the cache_dir line?
2) Shouldn't object distribution, in the cache, be roughly equal in all "buckets"? I can see from the above, a bell curve with 8-11 files/dir being the most common -- except for the 1748 dirs that are unused?
3) Any ideas on why I'm seeing so little %full in the cache -- shouldn't it be filling up to near the cache limit?
4) Any ideas why I'm not seeing a "flatter" directory usage curve?
5) Of particular interest: why do I only have about 4700 files when the cache dump indicates over 200K entries? I would expect about 200K items in the cache.
6) Of note: the refresh pattern used in the absence of an expiration time is:
refresh_pattern . 0 20% 4320
But, as I understand it, the refresh pattern only governs whether or not squid should recheck the last-modified date/time stamp for an object (in the absence of a browser asking for "no-cache"/reload). So if items are not changed, they should stay in the cache until it fills up its quota, at which time the replacement algorithm is used to delete old pages (as per the replacement policy).
No? Yes? Can anyone shed some light on this?
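To make my reading of the refresh_pattern line in question 6 concrete, here is a small Python sketch of the freshness heuristic as I understand it from the Squid documentation, for objects with no explicit expiry. It is a simplification, not Squid's actual code: the function name `is_fresh` and its parameters are invented for illustration, and the defaults are the min (0 minutes), percent (20%) and max (4320 minutes) values from the line above.

```python
def is_fresh(age_s, lm_age_s=None, min_minutes=0, percent=20, max_minutes=4320):
    """Simplified freshness check for an object without an explicit expiry.

    age_s:    seconds since the object was fetched or last validated
    lm_age_s: seconds between the object's Last-Modified time and the
              time it was fetched, or None if no Last-Modified is known
    """
    # Older than the configured maximum: always treated as stale.
    if age_s > max_minutes * 60:
        return False
    # With a Last-Modified time, compare the age to a fraction of how
    # long the object had gone unchanged when it was fetched.
    if lm_age_s is not None and lm_age_s > 0:
        return age_s / lm_age_s < percent / 100
    # Otherwise only the configured minimum keeps the object fresh.
    return age_s <= min_minutes * 60


# An object fetched an hour ago that had been unchanged for ten days:
print(is_fresh(3600, 10 * 86400))
```

If this reading is right, a stale result only means the object is revalidated with the origin server on the next request; it says nothing about the object being removed from disk.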
Thanks, -linda