Amos, I thought CVE-2009-0801 was about something slightly different. Now, after researching it in more detail, I believe I understand what you mean. For the sake of other people investigating the same issues, here is the CVE-2009-0801 overview:

Squid, when transparent interception mode is enabled, uses the HTTP Host header to determine the remote endpoint, which allows remote attackers to bypass access controls for Flash, Java, Silverlight, and probably other technologies, and possibly communicate with restricted intranet sites, via a crafted web page that causes a client to send HTTP requests with a modified Host header.

So the problem is that if a client tries to connect (this could be a direct forgery by the client or a request generated by some web page) to, say, host1.rogue.xxx, and it resolves to 192.168.0.2 (our local server with sensitive information), the request fulfillment with the patch would look like this:

1. The client (whose IP is 192.168.2.20/255 - no direct connection to the sensitive server) tries to establish an outgoing connection via the router/firewall to a random or well-known IP (e.g. 189.190.191.192 or an IP of google.com).
2. The firewall checks its rules, finds that it's a public IP, allows the connection and forwards it to Squid.
3. Squid checks the IP vs. the Host header, finds a forgery attempt and proceeds to the patch code.
4. There it picks the IP of the resolved Host and tries to connect to the sensitive server.
5. It will succeed IF there is no firewall check behind Squid.

If I understand it correctly, the only possible attack scenario is when the Squid server has unrestricted access to sensitive information and at the same time the affected client does NOT have direct access to it (if the client DOES have it, it doesn't even go through the proxy). That looks like an incomplete firewall setup to me. Also, please take a look at the comments on this article (one of the original articles about the issue): http://forums.theregister.co.uk/forum/1/2009/02/23/serious_proxy_server_flaw/. They almost all agree it's a very UNcommon attack vector.

So, if the risk is so narrow and can easily be mitigated by correct firewalling of Squid's outgoing connections (which should be done in any case - think of some unknown vulnerability in Squid combined with its unrestricted access to sensitive information; and why should Squid have access to the intranet at all?), why not make an option (possibly with the default value set to off) to permit this functionality? I clearly prefer rock-solid caching plus one additional firewall rule to unpredictable cache behavior and lots of cacheable traffic going out to the external world.

I believe the only real risk with this issue is when Squid admins are unaware of it. To mitigate this, there could be explicit warnings in the documentation and in Squid's startup output (maybe at -d 2, when this option is turned on) that this option requires some firewalling of Squid's outgoing access.

The main problem is that (at least in my setup) the unpatched version sometimes doesn't even work with Windows updates, not to mention the maxobjsize issue that leaves almost no caching at all. When I installed Squid for the first time 3 weeks ago (solid caching was needed for one infrastructure with bandwidth issues) and saw how low the hit rate was (around 10% of the traffic), I thought that maybe it wasn't worth the time spent on its correct configuration. Now, with more than 90% of traffic cached, it's clearly a win. And my assumption is that a lot of Squid users are in a similar situation.
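To make the firewalling part concrete: if the firewall runs on the Squid box itself, I mean something as simple as the single rule below. This is only an illustration - the "proxy" user name and the 192.168.0.0/24 "sensitive" range are assumptions, adjust them to your setup:

  # reject outgoing connections made by the Squid process to the internal range
  iptables -A OUTPUT -m owner --uid-owner proxy -d 192.168.0.0/24 -j REJECT

If the firewall is a separate box behind Squid, the equivalent restriction goes on its forwarding path instead.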
The rationale is that there are a lot of ways to secure Squid, but without this patch there is no way to get stable caching.

-------

With respect to my security note about the patch: I wanted to make it clear that the patch is just a working concept of the idea. If the developers find the idea applicable, they will need to perform their own security checks, as I'm not an official contributor and may be unaware of some particular implementation issues, and sysadmins who decide to use it in production before an official release do so at their own risk.

-------

With respect to maximum_object_size: 'make' creates the src/cf_parser.cci file, which contains the calls that process the config options, like:

  default_line("maximum_object_size 4 MB");
  ...
  if (!strcmp(token, "cache_dir")) {
      cfg_directive = "cache_dir";
      parse_cachedir(&Config.cacheSwap);
      cfg_directive = NULL;
      return 1;
  };
  if (!strcmp(token, "store_dir_select_algorithm")) {
      cfg_directive = "store_dir_select_algorithm";
      parse_string(&Config.store_dir_select_algorithm);
      cfg_directive = NULL;
      return 1;
  };
  ...
  if (!strcmp(token, "maximum_object_size")) {
      cfg_directive = "maximum_object_size";
      parse_b_int64_t(&Config.Store.maxObjectSize);
      cfg_directive = NULL;
      return 1;
  };

parse_cachedir is defined in src/cache_cf.cc at line 1914, and at line 1958 it calls update_maxobjsize(), which limits the store_maxobjsize variable (the internal maximum_object_size variable of the store data structure) to the value of maximum_object_size defined at the moment this function executes, for all stores (all store directories). So if parse_cachedir is called before parse_b_int64_t(&Config.Store.maxObjectSize), we get the effect of default_line("maximum_object_size 4 MB"). BUT when we later reach the parse_b_int64_t(&Config.Store.maxObjectSize) call, the option is still processed and shown on the cachemgr config page.

The src/cf_parser.cci file is generated by src/cf_gen.cc, which is compiled and then run by make. The compiled cf_gen takes all the instructions for generating src/cf_parser.cci (as well as the squid.conf.documented and squid.conf.default files) from src/cf.data.pre, and the order of the initialization calls in src/cf_parser.cci follows the order of the config entries in src/cf.data.pre.

A very simple fix would be to move the cache_dir entry in src/cf.data.pre to the end of the file, but this would also affect the generated squid.conf.documented and squid.conf.default files (nothing serious compared to not processing maximum_object_size correctly, but not a clean solution). A better solution, I believe, would be to group all related options in src/cf.data.pre and sort them according to their processing dependencies. For the cache-related options that means grouping them and placing the cache_dir entry at the end of the group. This way the documentation stays logically grouped/sorted and the maximum_object_size problem is fixed.
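Just to double-check my understanding of the ordering effect, here is a tiny standalone sketch. This is NOT the actual Squid code - the names below only mimic the call order produced by the generated src/cf_parser.cci:

  // Simplified sketch (not the Squid sources): mimics the generated call order.
  #include <cstdint>
  #include <cstdio>

  static int64_t configuredMaxObjectSize = 4 * 1024 * 1024;  // default_line("maximum_object_size 4 MB")
  static int64_t store_maxobjsize = -1;                       // the limit the store actually enforces

  // stands in for parse_b_int64_t(&Config.Store.maxObjectSize)
  static void parse_maximum_object_size(int64_t v) { configuredMaxObjectSize = v; }

  // stands in for parse_cachedir() -> update_maxobjsize(): the store takes a
  // snapshot of whatever limit is configured at this moment
  static void parse_cachedir() { store_maxobjsize = configuredMaxObjectSize; }

  int main() {
      // same order as in my squid.conf: cache_dir first, maximum_object_size second
      parse_cachedir();                                 // store snapshots the 4 MB default
      parse_maximum_object_size(1024LL * 1024 * 1024);  // "1 GB": parsed and shown in cachemgr, but too late
      std::printf("effective store limit: %lld bytes\n",
                  static_cast<long long>(store_maxobjsize));  // prints 4194304
      return 0;
  }

With the two directives swapped in squid.conf (maximum_object_size before cache_dir), or with the cache_dir entry moved after the cache options in src/cf.data.pre, the store would snapshot 1 GB instead.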
Regards,
Anatoli

-----Original Message-----
From: Amos Jeffries [mailto:squid3@xxxxxxxxxxxxx]
Sent: Tuesday, April 22, 2014 01:44
To: squid-users@xxxxxxxxxxxxxxx
Subject: Re: Strange misses of cacheable objects [SOLVED]

On 22/04/2014 11:04 a.m., Anatoli wrote:
> OK, found the problem. All the "problematic" objects are from multi-IP domains, and sometimes the browser resolves them and sends the request to an IP that is not in the list (this is for intercept mode).
>
> So, in the browser with http_watch I see that the request for http://www.googleadservices.com/pagead/conversion_async.js is sent to 173.194.118.122, but in nslookup with the set debug option I see:
>
> Name:      pagead.l.doubleclick.net
> Addresses: 173.194.118.45
>            173.194.118.58
>            173.194.118.57
> Aliases:   www.googleadservices.com
>
> The IP resolved by the browser is not in the list!
>
> So, squid interprets this as a destination IP forgery and doesn't cache the response. This behavior is documented under the host_verify_strict option. By default it's set to off, which is why the reason is difficult to discover. If you set it to on and try to download a problematic object, squid will return URI Host Conflict (409 Conflict) and in the access.log you'll see TAG_NONE/409 (additionally, with increased debug levels, you'll also see security alerts).

The beta releases optimistically had strict verification enabled by default. Sadly, we had to disable it by default due to a high number of issues seen with Google and Akamai hosted sites.

> This should partly explain the numerous complaints about more-than-expected misses.
>
> This is actually a problem, as the IP mismatches are not due to an artificially crafted request, but to the normal functioning of DNS and the different levels of its caching. The reason for the IP mismatch should be the frequency of DNS updates for these multi-IP domains. Actually, you can see with nslookup in debug mode that www.googleadservices.com has a default TTL of just 5 min, cdn.clicktale.net of 2 min, google.com of 1 min 25 sec and global.ssl.fastly.net of 25 sec. When I restart the DNS Client service, I get a HIT from squid for almost all of the originally published problematic objects without any security alerts, until the IP discrepancies start to appear again.
>
> So, it looks like the destination IP forgery check should be relaxed somehow (for example, with a /24 mask, as the majority of the mismatches are in the last octet), or squid should cache all the IPs for all the domains for a long time, just for this forgery check.

Unfortunately we are already walking a very thin line in security between safe and unsafe actions.

NP: It took over 2 years with multiple people getting involved and counter-checking each other on use-cases and testing on live traffic to reach the state we have today. So do not be discouraged by what I'm about to say below.

> Another (at least as a temporary workaround) option would be to disable this check completely, as it actually poses very little risk for a correctly configured squid with trusted clients. At the same time, an untrusted client could request a virus for some known file via his own host and this way make squid cache and distribute an infected file to the rest of the clients.

This is not an option. The biggest hurdle resolving this vulnerability is that *all* clients can be hijacked or subverted - so there are no trusted clients at all.

> The best option, I think, would be for the requests considered a forgery to overwrite the destination IP provided by the client with one of the resolved IPs for the domain in the Host field (like with client_dst_passthru off).

Doing this action is the vulnerability described in CVE-2009-0801. Any client can send a forged Host header and cause the proxy to resolve the IP to be a different one, bypassing *firewall* IP-level protections. How do you know the Host header contains accurate data?
There are only two guarantees:

1) that the client was *definitely* fetching from the TCP IP:port;
2) that the IP:port in #1 does *not* match the server DNS records.

The implication is that this is either a hijacking, or the server moved.

> And here is a patch for this. Please note I haven't done extensive security issues verifications,

Please do that before posting patches to bypass security restrictions. Particularly security restrictions which are so obviously annoying to many people. We don't exactly like being annoying, so there is always a good reason for it when we are.

<snip>

> After applying this patch the hit rate increased significantly for all types of objects, not only for those that match refresh_pattern options. No more random misses, then hits, then misses again.

NOTE: All clients behind your network are now vulnerable to a 15-line javascript or a 6-line flash applet which can be embedded in any web page. All it takes is one client with scripting enabled to run it and the entire network is hijacked.

As you found already, at least one of the major sources of verification failures is an advertising service (googleadservices). Given that ad services commonly present scriptlets written by unknown third parties... There are infections out there which use this vulnerability.

Also, a forwarding loop DoS is just as easy to trigger as cache corruption and has far more immediate side effects - this effect is used by at least one security scanning software (by Trend Micro) to detect vulnerable proxies [by crashing them].

Since you seem to have the ability to find and make patches: the only way we know of to safely cache these files is to add the destination IP+port of the server where the object was fetched to the cache key. That is expected to raise the HIT ratio somewhat by allowing "bad" clients to get HITs without corrupting anything for "good" clients. Lack of time to focus on it has been the main blocker in adding that.

Note this will still cause some extra MISSes when the DNS used by Squid and the client are out of sync - as the "bad" objects get cached once for each untrusted origin.

Also note that the verification should not place any restrictions on HITs for content already in the cache. A "bad" fetch can safely be delivered a HIT cached by an earlier "good" fetch. So sites which are cache friendly to begin with have a much reduced likelihood of encountering a MISS from this problem even if they do move IPs.

Unfortunately there are prices to be paid for violating protocols (in this case TCP). Extra MISSes on some traffic is one of them. Just like losing the ability to authenticate users.

> Still, the adobe .exe file was not caching. So I decided to continue the investigations and finally found what the problem was.
>
> With adequate debug_options enabled, squid was saying that the object size was too big (I've added the CL (Content-Length), SMOS (store_maxobjsize) and EO (endOffset) variables to the log line):
>
> 2014/04/21 00:35:35.429| store.cc(1020) checkCachable: StoreEntry::checkCachable: NO: too big (CL = 33560984; SMOS = 4194304; EO = 268)
>
> Clearly, something was wrong with the maxobjsize, which was set in the config to 1 GB while the log reported it as 4 MB (which I later discovered to be the default value).
> After some additional research, I found that in the src/cf_parser.cci file (generated by make) there are 2 calls to the configuration initialization functions for almost all the configuration options - the first one for the predefined (default) values and the second one for the config file values. There is a function parse_cachedir (defined in src/cache_cf.cc) that initializes the store data structure with the store-related options (like maxobjsize); it is called when the config parser finds the cache_dir option in the config and it is not called again when it finds the other cache-related options. So, if you put something like this in your config (like it was in mine):
>
> cache_dir aufs /var/cache 140000 16 256
> maximum_object_size 1 GB
>
> then the maximum_object_size option is processed and you see it on the cachemgr config page, but it has no effect, as the store data structure parameter maxobjsize was already initialized (with the default value) by parse_cachedir before the "maximum_object_size 1 GB" line was parsed, so we get the 4 MB (default) effective maximum_object_size.
>
> If we have a config with
>
> maximum_object_size 1 GB
> cache_dir aufs /var/cache 140000 16 256
>
> we get the effective maximum_object_size for the store set to 1 GB as expected.

Aha. Thank you for tracking this one down. That is a behaviour we have been looking for for a while.

I'm still a little unfamiliar with the store internals though. Can you please point me at the place you found the early initialization being done?

> There are warnings in the documentation that the order of config options is important, but it is only explained in the context of ACLs and other unrelated settings. In my opinion, this is a huge problem, as it is not at all obvious what should precede what. There should be at least a note in the documentation for each option affected by the order of config processing, there should be a final "all effective values" output at squid initialization (maybe with -d 2 and higher), and of course the cachemgr config page should show the correct (effective) values.

Some people complain that dumping over 16KB to the logs (possibly syslog) on each daemon startup is a bit unfriendly.

The cachemgr "config" report should contain all finalized configuration settings. Unfortunately that does not show toggle-like and repeated configuration values nicely. If it is showing anything inaccurate for the cache_dir max-size= parameters, that is a bug that needs fixing.

> Now it is:
> maximum_object_size @ cachemgr config page: 2147483648 bytes
> Effective maximum_object_size: 4194304 bytes
>
> And a better solution would be to call parse_cachedir (and similar functions) at the end of the config file processing (an extremely simple fix in the src/cf_parser.cci generation).

FYI: parse_*() and similar *are* the config file processing.

> Now, with the patch and the "correct" order of maximum_object_size and cache_dir (put cache_dir after all the cache-related options, including the memory cache ones), all "problematic" objects are cached as expected and there is a huge (like 10-fold on average and more than 100-fold for WU and similar) increase in the hit rate. Rock-solid caching!
>
> Regards,
> Anatoli

Cheers
Amos