Dear Amos

Thank you for reviewing the config and giving your deeply considered comments.

On 13 April 2014 09:56, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> Did your tests find any actual benefits in these "override-lastmod
> override-expire ignore-reload ignore-must-revalidate ignore-private"
> settings ?
>
> My tests earlier showed the reload-into-ims option was all that was
> needed to make update caching behave nicely. It is also the only one of
> those options which produces RFC compliant behaviour by the proxy.

Yes! Clients generate zillions of range requests. This creates loads of
revalidation. I have adopted the assumption that exe, cab and similar
files on Windows update servers are static. A different file will take a
different URL. Perhaps there are border cases where this assumption would
fail, and maybe this needs more thought. I do think it is fair, though, to
assume that URLs with an embedded SHA1 checksum will always deliver the
same content.

I might rewrite this part to use reload-into-ims for URL patterns which
don't include a checksum, and use the full override and never expire for
those URLs which do embed a checksum.

> NP: Squid understands byte units whenever you see "KB" being used in config.
>
> So:
>  maximum_object_size 200 MB
>  maximum_object_size 6 GB
>
> Which is the first "howler". That directive does not take an access
> list and only last value set matters. So adding " windowsupdate" to the
> 6GB line and setting the 200MB value are both just useless text in the
> config file.

Ok. I really would like to limit object size by ACL, but will have to
live with that!

>> #My internet connection is not just used for Squid. I want to leave
>> #responsive bandwidth for other services.
>> #This limits D/L speed
>> delay_pools 1
>> delay_class 1 1
>> delay_access 1 allow all
>> delay_parameters 1 1200000/1200000
>
> It is better to use QoS controls in the system network settings that
> limit Squid (usually by PID number) than applying a class-1 delay pool
> to everything.

I do have an iptables firewall set up and will perhaps add that to the
bottom of my to-do list, unless I find it ineffectual and problematic.

>> #We use the store_id helper to convert windows update file hashes to bare URLs.
>> #This way, any fetch for a given hash embedded in the URL will deliver
>> #the same data
>> #You must make your own /etc/squid3/storeid_rewrite, instructions at end.
>> #change the helper program location from
>> #/usr/local/squid/libexec/storeid_file_rewrite to wherever yours is
>> #It is written in PERL, so on most Linux systems, put it somewhere
>> #convenient, chmod 755 filename
>> store_id_program /usr/local/squid/libexec/storeid_file_rewrite /etc/squid3/storeid_rewrite
>> store_id_children 10 startup=5 idle=3 concurrency=0
>> store_id_access allow windowsupdate
>> store_id_access deny all
>
> concurrency=0 is bad. Although I see this is due to a lack of
> concurrency in the helper. That's a bug which should get fixed.
>
>> #We want to cache windowsupdate URLs which include queries
>> #but only those queries which act on an installable file.
>> #we don't want to cache queries on asp files as this is a genuine server
>> #side query as opposed to a cache breaker
>> acl wupdatecachablequery urlpath_regex (cab|exe|ms[i|u|f]|asf|wm[v|a]|dat|zip|psf|appx|appxbundle|esd)\?
>>
>> #Deny caching for URLs matching query but not windowsupdate
>> cache deny QUERY !windowsupdate
>> #Deny caching for URLs matching query and windowsupdate but not cachable updates
>> cache deny QUERY windowsupdate !wupdatecachablequery
>
> What does this help with exactly? Current Squid are perfectly capable of
> caching despite query-string presence.
> In fact we recommend dropping acl QUERY entirely and adding this right
> above the '.' refresh_pattern:
>  refresh_pattern -i (/cgi-bin/|\?) 0 0% 0

I have three classes:
  Any URL with a query string.
  Any URL to a windows update server.
  Any URL to a windows update server which is specifically cache-able.

To paraphrase the logic coded here: don't cache anything with a query
string UNLESS it matches the ACL wupdatecachablequery.

Another way to write this more succinctly might be:
  cache deny QUERY
  cache allow wupdatecachablequery

But I am not certain whether the deny clause will take a higher priority
than the allow clause in cases where both ACLs match. The fandangled
logic avoids this.

>> #Given windows update is un-cooperative towards third party
>> #methods to reduce network bandwidth, it is safe to presume
>> #cache-specific headers or dates significantly differing from
>> #system date will be unhelpful
>> reply_header_access Date deny windowsupdate
>> reply_header_access Age deny windowsupdate
>
> The "given" actually is not true IME. So not a safe assumption.
>
> Bad behaviour in the HTTP/1.1 revalidation by clients is a common side
> effect of the override-* and ignore-* options being used on refresh_pattern.
> The overrides used above make Squid ignore the caching boundary
> conditions about when objects become stale or expire. So the client
> fetch can a) MISS earlier than necessary, or b) HIT on a stale object
> with headers indicating it is obsolete well before delivery time -
> clients DO resolve that by re-fetching with a forced reload. In (a)
> refreshing uses full-object bandwidth more frequently than necessary, in
> (b) repairing the corrupted objects costs 2x bandwidth a normal MISS
> would have cost.
>
> When reload-into-ims is used Squid translates annoying reload behaviour
> into friendlier refresh behaviour.
> At worst Squid is required to do a
> revalidation (almost no cost in bandwidth) to update the timestamps on
> content delivered to the client. Avoiding problem (b) above entirely is
> well worth that (very small) extra time delay on occasional WU.
>
> Caching and revalidation seems in my experience to be performed properly
> by the windows update tools. At least in WindowsXP SP2 and Windows 7
> which I have tested on.

>> #Put the two following lines in /etc/squid3/storeid_rewrite omitting
>> #the starting hash
>> #^http:\/\/.+?\.ws\.microsoft\.com\/.+?_([0-9a-z]{40})\.(cab|exe|ms[i|u|f]|asf|wm[v|a]|dat|zip|psf|appx|esd)
>> http://wupdate.squid.local/$1
>> #^http:\/\/.+?\.windowsupdate\.com\/.+?_([0-9a-z]{40})\.(cab|exe|ms[i|u|f]|asf|wm[v|a]|dat|zip|psf|appx|esd)
>> http://wupdate.squid.local/$1

I'll update these patterns to be server agnostic.

I'll update the refresh pattern to account for whether a URL has an
embedded checksum. If not, use reload-into-ims; otherwise assume it is
guaranteed static.
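
As a rough, untested sketch of the refresh pattern split described above:
the timing values are placeholders chosen for illustration (they are not
taken from my running config), the host names in the second rule are an
assumption based on the servers mentioned in this thread, and I have
written ms[iuf] and wm[va] where the earlier patterns had ms[i|u|f] and
wm[v|a], since inside a bracket expression the "|" is a literal character.

```
# Sketch only, untested. Timing values (in minutes) are illustrative placeholders.
# refresh_pattern lines are checked in order and the first match is used,
# so the checksum rule has to come before the more general rule.

# URLs with an embedded 40-character checksum: assume the content never changes.
refresh_pattern -i _[0-9a-z]{40}\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|appxbundle|esd) 525600 100% 525600 override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private

# Update files without a checksum: only turn client reloads into revalidations.
refresh_pattern -i (windowsupdate\.com|microsoft\.com)/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|appxbundle|esd) 4320 80% 43200 reload-into-ims
```

The idea is that only the first rule departs from RFC behaviour, and only
for URLs where the checksum makes the "static content" assumption
reasonable. Everything else falls through to the reload-into-ims rule.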