Hi Jasper

I have compiled Squid 3.4 to get the store_id functionality implemented by Eliezer. I have it running in a heterogeneous production environment. I'm still checking for bugs, but it seems to work well.

#squid.conf file for Squid Cache: Version 3.4.4
#compiled on Ubuntu with configure options: '--enable-async-io=8' '--enable-storeio=ufs,aufs,diskd' '--enable-removal-policies=lru,heap'
#'--enable-delay-pools' '--enable-underscores' '--enable-icap-client' '--enable-follow-x-forwarded-for' '--with-logdir=/var/log/squid3'
#'--with-pidfile=/var/run/squid3.pid' '--with-filedescriptors=65536' '--with-large-files' '--with-default-user=proxy'
#'--enable-linux-netfilter' '--enable-storeid-rewrite-helpers=file'

#Recommendations: in full production, you may want to set debug options from 2 to 1 or 0.
#You may also want to comment out strip_query_terms off for user privacy

logformat squid %tg.%03tu %6tr %>a %Ss/%03>Hs %<st %rm %ru %[un %Sh/%<a %mt

#Explicitly define logs for my compiled version
cache_store_log /var/log/squid3/store.log
access_log /var/log/squid3/access.log
cache_log /var/log/squid3/cache.log

#Let's have a fair bit of debugging info
debug_options ALL,2

#Include query strings in logs
strip_query_terms off

acl all src all

#Which domains do Windows updates come from?
acl windowsupdate dstdomain .ws.microsoft.com
acl windowsupdate dstdomain .download.windowsupdate.com

acl QUERY urlpath_regex cgi-bin \?

#I'm behind a NAT firewall, so I don't need to restrict access
http_access allow all

#Uncomment these if you have web apps on the local server which auth through local ip
#acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
#http_access deny to_localhost

visible_hostname myclient.hostname.com
http_port 3128

#Always optimise bandwidth over hits
cache_replacement_policy heap LFUDA

#Windows update files are HUGE! I have set this to 6 GB.
#A recent (as of Apr 2014) Windows 8 update file is 4 GB
maximum_object_size 6 GB

#Set these according to your file system
cache_dir ufs /home/smb/squid/squid 70000 16 256
coredump_dir /home/smb/squid/squid

#Guaranteed static content from Microsoft. Usually fetched with range requests so let's not revalidate.
#Underscore, 40 hex digits (SHA1 hash), dot, extension
refresh_pattern _[0-9a-f]{40}\.(cab|exe|esd|psf|zip|msi|appx) 518400 80% 518400 override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private

#Otherwise potentially variable
refresh_pattern -i ws.microsoft.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd) 43200 80% 43200 reload-into-ims
refresh_pattern -i download.windowsupdate.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd) 43200 80% 43200 reload-into-ims

#Default refresh patterns last if no others match
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320

#Directive sets I have been experimenting with
#override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private
#reload-into-ims

#Windows updates use a lot of range requests. The only way to deal with this
#in Squid is to fetch the whole file as soon as requested
range_offset_limit -1 windowsupdate
quick_abort_min -1 KB windowsupdate

#My internet connection is not just used for Squid. I want to leave
#responsive bandwidth for other services. This limits D/L speed
delay_pools 1
delay_class 1 1
delay_access 1 allow all
delay_parameters 1 1200000/1200000

#We use the store_id helper to convert Windows update file hashes to bare URLs.
#This way, any fetch for a given hash embedded in the URL will deliver the same data
#You must make your own /etc/squid3/storeid_rewrite; instructions at end.
#Change the helper program location from /usr/local/squid/libexec/storeid_file_rewrite to wherever yours is
#It is written in Perl, so on most Linux systems, put it somewhere convenient, chmod 755 filename
store_id_program /usr/local/squid/libexec/storeid_file_rewrite /etc/squid3/storeid_rewrite
store_id_children 10 startup=5 idle=3 concurrency=0
store_id_access allow windowsupdate
store_id_access deny all

#We want to cache windowsupdate URLs which include queries
#but only those queries which act on an installable file.
#We don't want to cache queries on asp files as this is a genuine server
#side query as opposed to just a cache breaker
acl wupdatecachablequery urlpath_regex (cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|appxbundle|esd)\?
cache allow windowsupdate wupdatecachablequery
cache deny QUERY

#Given Windows update is un-cooperative towards third party
#methods to reduce network bandwidth, it is safe to presume
#cache-specific headers or dates significantly differing from
#system date will be unhelpful
reply_header_access Date deny windowsupdate
reply_header_access Age deny windowsupdate

#Put the following line in /etc/squid3/storeid_rewrite, omitting the starting hash.
#Fields are separated by a tab.
#_([0-9a-z]{40})\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd)	http://wupdate.squid.local/$1

On 16 April 2014 07:26, Jasper Van Der Westhuizen <jvdwesthuiz@xxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 2014-04-15 at 14:38 +0100, Nick Hill wrote:
>> URLs with query strings have traditionally returned dynamic content.
>> Consequently, http caches by default tend not to cache content when
>> the URL has a query string.
>>
>> In recent years, notably Microsoft and indeed many others have adopted
>> a habit of putting query strings on static content.
>>
>> This could be somewhat inconvenient on days where Microsoft push out a
>> new 4 GB update for Windows 8, and you have many such devices connected
>> to your nicely cached network.
>> Each device will download exactly the
>> same content, but with its own query string.
>>
>> The net result is generation of a huge amount of network traffic.
>> Often for surprisingly minor updates.
>>
>> I am currently testing a new configuration for squid which identifies
>> the SHA1 hash of the windows update in the URL, then returns the bit
>> perfect cached content, irrespective of a wide set of URL changes. I
>> have it in production in a busy computer repair centre. I am
>> monitoring the results. So far, very promising.
>
> Hi Nick
>
> As you rightly said, Windows 8 devices are becoming more and more common
> now, especially in the workplace. I don't want to download the same 4GB
> update multiple times. Would you mind sharing your SHA1 hash
> configuration or is it perhaps available somewhere?
>
> Regards
> Jasper
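
For anyone who wants to see what the storeid_rewrite rule in the config does before deploying it, here is a minimal Python sketch of the mapping. It is not the Perl helper itself, only an illustration. It assumes the helper matches the pattern anywhere in the URL and substitutes the first capture group for $1. The two update URLs, their host prefixes and the hash value are made up for the example.

```python
import re

# Pattern equivalent to the rule in /etc/squid3/storeid_rewrite:
# underscore, 40 characters of hash, dot, known update file extension.
RULE = re.compile(
    r"_([0-9a-z]{40})\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd)"
)

# Replacement side of the rule; the hash ($1) is appended to this prefix.
STORE_ID_PREFIX = "http://wupdate.squid.local/"


def store_id(url):
    """Return the store ID for a URL, or None if the rule does not match."""
    match = RULE.search(url)
    if match is None:
        return None
    return STORE_ID_PREFIX + match.group(1)


# Hypothetical URLs: same file hash, different mirror, path and query string.
FILE_HASH = "0123456789abcdef0123456789abcdef01234567"
url_a = ("http://au.download.windowsupdate.com/d/msdownload/update/example_"
         + FILE_HASH + ".cab?client=1")
url_b = ("http://bg.ws.microsoft.com/c/other/path/example_"
         + FILE_HASH + ".cab?client=2")

print(store_id(url_a))
print(store_id(url_b))
print(store_id("http://download.windowsupdate.com/query.asp?x=1"))
```

Both update URLs come out with the same store ID, so Squid stores and serves a single cached object for them, while the asp query does not match the rule and keeps its own URL as the cache key.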