Actually, I did find another reason for the non-caching of this specific
URL: the maximum_object_size default is 4 MB, while the object is about
6 MB.

While experimenting, I stumbled upon an undocumented requirement:
maximum_object_size MUST be placed before cache_dir. Otherwise the
cachable check fails with "too big" for any object larger than the
default 4 MB, regardless of the maximum_object_size value.

I'm wondering if there are more undocumented precedence dependencies
like this that can materially impact cache effectiveness.

On Mon, Apr 21, 2014 at 3:35 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> On 21/04/2014 11:22 p.m., Timur Irmatov wrote:
>> On Mon, Apr 21, 2014 at 2:06 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
>>> On 21/04/2014 6:56 p.m., Timur Irmatov wrote:
>>>> 2014/04/21 11:46:03.940 kid1| ctx: exit level 0
>>>> 2014/04/21 11:46:03.940 kid1| store.cc(1011) checkCachable:
>>>> StoreEntry::checkCachable: NO: not cachable
>>>>
>>>> So Squid considers the server's reply uncacheable. Why?
>>>>
>>>
>>> Something (unknown) has marked it to be discarded before it finished
>>> arriving. There is no sign of the store lookup logic looking up an
>>> existing entry either.
>>> An ALL,6 trace (very big) will probably be needed for that one.
>>
>> After clearing the cache and enabling an ALL,6 trace, I performed
>> several requests through my proxy.
>>
>> Now in cache.log I do see the line "SECURITY ALERT: Host header forgery
>> detected". Indeed, guard.cdnmail.ru sometimes resolves to different IP
>> addresses.
>>
>> What are my options now? Is it possible to disable host forgery detection?
>
> No. It is done to prevent your proxy being hijacked through malicious
> web bugs corrupting the cache with infected downloads.
>
> Imagine what would happen if one of your clients' browsers was delivered
> an "advert" which was actually a script that sent an HTTP request with
> the URL "http://google.com/" and fetched it directly from the IP of a
> server run by the attacker. If that response got cached, all your users
> fetching the Google home page from the proxy would get infected with
> anything the attacker wanted to deliver.
>
>>
>> Also, Traffic Server has an option to skip the DNS lookup and use the
>> remote IP address from the incoming client connection. Is it possible
>> to do the same? The idea is to skip the double DNS lookup, one by the
>> client and one by the proxy server.
>
> Squid does this by default. You can see it in the logs earlier:
>
> Found sources for 'http://guard.cdnmail.ru/GuardMailRu.exe'
> always_direct = DENIED
> never_direct = DENIED
> DIRECT = local=X.X.X.X remote=217.69.139.110:80 flags=25
>
> The remote= value is the IP the client was connecting to. There seems
> to be a small bug in the display; it should say ORIGINAL_DST instead
> of DIRECT.
>
>>
>>> There are two other obvious things to check.
>>>
>>> The first is that this request is arriving on the tproxy port and the
>>> domain name appears to be using different IPs in geographically based
>>> responses. Is the Squid box getting the same 217.69.139.110 destination
>>> as the client was contacting?
>>
>> Yes, as I stated above.
>>
>>> The second is the storeid helper. What is its output?
>>> debug option 84,9
>>
>> The storeid helper does not rewrite this request in any way (it
>> replies with ERR).
>>
>
> Okay.
>
> Amos
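
For anyone hitting the same ordering pitfall described at the top of this
message, a minimal squid.conf sketch follows. The 8 MB limit, cache size,
and path are illustrative assumptions, not values from this thread; the
likely reason for the ordering is that each cache_dir captures the
object-size limits in effect at the point it is parsed.

```
# Must come BEFORE any cache_dir line: a cache_dir appears to record the
# size limits in effect when it is parsed, so a maximum_object_size
# declared later is ignored and the 4 MB default applies.
maximum_object_size 8 MB

# aufs store: 10000 MB, 16 first-level / 256 second-level directories
# (path and sizes are placeholder values for illustration)
cache_dir aufs /var/spool/squid 10000 16 256
```

With the directives in this order the ~6 MB object fits under the limit;
with the order reversed the same object is rejected as "too big".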