Re: HDD/RAM Capacity vs store_avg_object_size

On 12/07/17 22:31, bugreporter wrote:
Hi,

Can anybody help me confirm my understanding of memory usage vs the
persistent cache capacity? Below is my understanding:

According to http://wiki.squid-cache.org/SquidFaq/SquidMemory:

1- We need 14 MB of memory per 1 GB on disk for 64-bit Squid. The wiki has
said this for as long as I have known Squid (i.e. I'm very old now). Is this
information still valid?

Yes. It is a rough estimate based on the size of code objects used to store each request message - they have not changed in at least the past 10 years. There may be some variance based on extra headers modern HTTP contains. But that is not a huge amount and the number is a rough estimate to begin with.




2- Is this assumption based on the default value of 13 KB for
*store_avg_object_size*?

No.

That avg object size is for the full object with payload. Those payloads are stored inside cache_mem or cache_dir, and do not take up index space. So they have a total limit of whatever you configure those storage areas to be.

Squid uses the above directive for its startup initialization of the index's hash table. The table can be resized dynamically, but that is quite expensive in terms of CPU cycles and would delay some requests, so this is a nice shortcut that avoids most of those pauses.
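
As a rough illustration of what that startup estimate looks like (this is simplified arithmetic, not Squid's actual sizing code, which rounds the result differently; it assumes the table is sized from configured capacity divided by store_avg_object_size and store_objects_per_bucket, defaults 13 KB and 20):

# Simplified sketch (not Squid's real code) of how store_avg_object_size
# lets Squid guess a sensible initial store hash-table size at startup
# instead of growing the table later under load.

def estimated_hash_buckets(cache_mem_mb, cache_dir_mb,
                           avg_object_kb=13, objects_per_bucket=20):
    """Estimate object capacity, then derive an initial bucket count."""
    total_kb = (cache_mem_mb + cache_dir_mb) * 1024
    expected_objects = total_kb // avg_object_kb
    return expected_objects // objects_per_bucket

# Example: default 256 MB cache_mem plus a 200 GB cache_dir.
print(estimated_hash_buckets(cache_mem_mb=256, cache_dir_mb=200 * 1024))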


The 10 or 14 MB is purely for the metadata necessary to index those cached objects: the HTTP message header text plus a bunch of Squid code objects.



3- If the answers to the questions above are both YES, can we deduce that we
need *182* bytes of memory per object in the persistent cache on a 64-bit system?
[*182* = (14 * 1024 * 1024) / (1024 * 1024 / store_avg_object_size)]

If you want to re-do the calculations for your own proxy, start with the values from the cachemgr "mem" report.

To get the metadata size, add the per-object sizes (first number column) of HttpReply + MemObject + HttpHeaderEntry + all objects whose name starts with HttpHdr* + StoreEntry + all objects whose name starts with StoreMeta*.

The rest is harder. You need to scan a disk cache and separate out the message headers - both counting the number of items found and totalling the size of the headers processed. Then multiply the metadata size by the number of objects in the cache and add the total message header size.

You now have total index size and total cache size for a given cache. Getting the N per GB from that should be easy and obvious.
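
If it helps, here is that same arithmetic as a small Python sketch. The input numbers are hypothetical placeholders standing in for your own mgr:mem sizes and cache_dir scan results:

def index_mb_per_cache_gb(metadata_bytes_per_object, object_count,
                          total_header_bytes, total_cache_bytes):
    """MB of index RAM per GB of cached data, from your own measurements."""
    index_bytes = metadata_bytes_per_object * object_count + total_header_bytes
    return (index_bytes / 1024 ** 2) / (total_cache_bytes / 1024 ** 3)

# Hypothetical example values - substitute your own measurements:
#   metadata_bytes_per_object: sum of the per-object sizes from mgr:mem (above)
#   object_count, total_header_bytes: results of scanning the cache_dir
#   total_cache_bytes: how much data those objects occupy on disk
print(index_mb_per_cache_gb(metadata_bytes_per_object=700,
                            object_count=5_000_000,
                            total_header_bytes=2_500_000_000,
                            total_cache_bytes=400 * 1024 ** 3))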



NP: The mgr:mem "In Use" count of StoreEntry gives you approximately the number of currently indexed objects. Though it does include some non-cacheable objects currently being replied to, so it is not completely accurate. You can use that to see how the index memory use compares to the memory use for extra in-transit data.



4- Today the *store_avg_object_size* should really be greater than 13 KB.
The mean object size I can see on my own cache is about 100 KB. Can anybody
refer me to a website where I can find fresh information?

The value for your particular Squid can be found in the cachemgr "info" report. It is listed as "Mean Object Size".
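
If you want to script it, that line is easy to pull out of the report text. A minimal sketch in Python (the info.txt path and the squidclient invocation are just examples; adjust for however you query the cache manager):

import re

# Parse "Mean Object Size" out of a saved cachemgr "info" report,
# e.g. one captured with `squidclient mgr:info > info.txt`.
with open("info.txt") as report_file:
    report = report_file.read()

match = re.search(r"Mean Object Size:\s*([\d.]+)\s*KB", report)
if match:
    print(f"Mean Object Size: {float(match.group(1))} KB")
else:
    print("Mean Object Size not found in report")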

The value varies between proxies, and depends directly on what your particular cache settings are compared to the traffic that proxy sees. So even two proxies receiving the same traffic might show very different values, and it is unlikely that any reference material you find from other people will be anything more than a rough approximation.


For example, my test proxy caching ISP-type traffic, with a fair bit of Facebook, YouTube etc. going through it:
"
	Mean Object Size:	106.08 KB
"

and a production CDN proxy in front of mostly Wordpress sites:
"
	Mean Object Size:	19.20 KB
"

Both with a 200 GB cache_dir and otherwise default cache settings.




5- If I'm completely on the wrong track, can anybody help me find a formula
to deduce the required RAM for a given HDD capacity (and vice versa)?


Still the same one listed in the wiki page.

Though nowadays the 2^27 objects-per-cache_dir limitation is proving to be far more restrictive than the RAM index size. So depending on your "Mean Object Size" you may find yourself limited to only using 100 GB or less of a TB HDD.
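
A rough sketch of both limits in Python. The 14 MB/GB figure is the wiki rule of thumb and 2^27 is the per-cache_dir object cap above; the 0.7 KB mean object size is a made-up example of a tiny-object workload, chosen only to show when the object cap becomes the binding limit:

# Back-of-the-envelope sketch of the two sizing limits discussed above.
MB_PER_GB_INDEX = 14            # ~14 MB of index RAM per GB of cache_dir (64-bit)
MAX_OBJECTS_PER_DIR = 2 ** 27   # per-cache_dir object limit mentioned above

disk_gb = 1024                  # a 1 TB HDD
mean_object_kb = 0.7            # hypothetical cache dominated by tiny objects

cap_gb = MAX_OBJECTS_PER_DIR * mean_object_kb / (1024 * 1024)
usable_gb = min(disk_gb, cap_gb)
index_ram_mb = usable_gb * MB_PER_GB_INDEX

print(f"object-count cap allows ~{cap_gb:.0f} GB of the {disk_gb} GB disk")
print(f"index RAM for the usable {usable_gb:.0f} GB: ~{index_ram_mb:.0f} MB")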

Amos



