
Re: Inconsistent accessing of the cache, craigslist.org images, wacky stuff.

> -----Original Message-----
> From: Amos Jeffries [mailto:squid3@xxxxxxxxxxxxx]
> Sent: Wednesday, October 28, 2015 10:31 AM
> To: Jester Purtteman <jester@xxxxxxxxxxx>; squid-users@lists.squid-cache.org
> Subject: Re:  Inconsistent accessing of the cache, craigslist.org
> images, wacky stuff.
> 
> On 29/10/2015 4:06 a.m., Jester Purtteman wrote:
> >
> >
> >> -----Original Message-----
> >> From: squid-users [mailto:squid-users-bounces@xxxxxxxxxxxxxxxxxxxxx]
> >> On Behalf Of Amos Jeffries
> >> Sent: Tuesday, October 27, 2015 9:07 PM
> >> To: squid-users@xxxxxxxxxxxxxxxxxxxxx
> >> Subject: Re:  Inconsistent accessing of the cache,
> >> craigslist.org images, wacky stuff.
> >>
> >> On 28/10/2015 2:05 p.m., Jester Purtteman wrote:
> >>> So, here is the problem:  I want to cache the images on craigslist.
> >>> The headers all look thoroughly cacheable, some browsers (I'm
> >>> glaring at you, Chrome) send with this thing that requests that
> >>> they not be cacheable,
> >>
> >> "this thing" being what exactly?
> >>
> > Thing -> rest of the request (you'd think someone who spoke a
> > language their entire life could use it, but I clearly still need
> > practice :)
> >
> >> I am aware of several nasty things Chrome sends that interfere with
> >> optimal HTTP use. But nothing that directly prohibits caching like you
> >> describe.
> >>
> >
> > The Chrome version of the headers has two lines that make my eye
> > twitch:
> >
> >  Cache-Control: max-age=0
> >  Upgrade-Insecure-Requests: 1
> >
> > Which (unless I don't understand what's going on, which is quite possible)
> > means "I don't want the response cached, and if possible, could we securely
> > transfer this picture of an old overpriced tractor?  It's military grade
> > intelligence information here that bad guys are trying to steal".  Am I
> > interpreting that wrong?
> >
> 
> max-age=0 from the client means "don't use whatever you have cached. Always
> go to the server for new content."
> The rest you got.
> 
> Are you using the reload or refresh button in your testing? That is expected
> to cause that max-age value.
> 
> If it is just sending that all the time anyway we will need to update our hacks.
> 

Refresh button, so that at least makes sense.

> 
> >
> > So getting lazy and using 8.8.8.8 because I don't have to remember which
> > server I installed bind or dnsmasq on has finally come back to haunt me...  I
> > actually had a nightmare of a time getting another system working over the
> > same problem, I'm giving this a rating of highly plausible.  I'll revise the
> > structure, if that fixes the issue, I'll let you know.
> >
> 
> Introducing auto-configuration :-)
> 
> You don't have to configure Squid with dns_nameservers at all these days. If
> you omit it entirely Squid will load the system's resolv.conf and use whatever
> resolver(s) are in there.
> 
> 

The problem was that /etc/resolv.conf on my Squid cache was also pointing to different name servers than the ones the clients were using.  Once I gave them all the same name server, everything else fell into step, and I discovered that because... read on

> >>>
> >>> So, big question, what debug level do I use to see this thing making
> >>> decisions on whether to cache, and any tips anyone has about this
> >>> would be appreciated.  Thank you!
> >>
> >> debug_options 85,3 22,3
> >>
> >
> > I have used 22,3 which I gleaned from another post on this list, I find a lot of
> > this in my cache.log:
> >
> > 2015/10/27 18:23:18.402| ctx: enter level  0: 'http://images.craigslist.org/00707_cL1v48AjUBR_300x300.jpg'
> > 2015/10/27 18:23:18.402| 22,3| http.cc(328) cacheableReply: NO because e:=p2XDIV/0x24afa00*3 has been released.
> > 2015/10/27 18:23:18.409| ctx: exit level  0
> >
> > I'll let you know if fixing DNS takes that out.
> >
> 
> Hmm. I'm interested now. Will look that up when I have time later.
> 
> Amos

So, after I read your first reply, I responded with a quick snippet of the log file that came from logging level 85,3.  It looked like:

"""QUOTING ANOTHER EMAIL"""
2015/10/28 09:16:54.075| 85,3| client_side_request.cc(532) hostHeaderIpVerify: FAIL: validate IP 208.82.238.226:80 possible from Host:
2015/10/28 09:16:54.075| 85,3| client_side_request.cc(543) hostHeaderVerifyFailed: SECURITY ALERT: Host header forgery detected on local=208.82.238.226:80 remote=192.168.2.56 FD 20 flags=17 (local IP does not match any domain IP) on URL: http://seattle.craigslist.org/favicon.ico
""" END QUOTE """

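Based on my reading of http://wiki.squid-cache.org/KnowledgeBase/HostHeaderForgery it appears this is actually intended behavior: the IP the client connected to did not match any IP that Squid resolved for the Host header, so the check failed.  That also explains why the object is being released and was found non-cacheable.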
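
For anyone hitting the same thing, here is a rough Python sketch of the idea behind that check.  It is not Squid's code, just an illustration: the function names are made up for this example, and the resolver is passed in so that the "client's view" and the "proxy's view" of DNS can be compared.  The addresses in the usage note are documentation-range placeholders, not real craigslist IPs.

```python
import socket
from typing import Callable, Iterable


def resolve_ipv4(host: str) -> set:
    """Ask the local resolver for the IPv4 addresses of host."""
    infos = socket.getaddrinfo(host, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def host_matches_destination(
    host: str,
    dest_ip: str,
    resolve: Callable[[str], Iterable[str]] = resolve_ipv4,
) -> bool:
    """Return True if dest_ip is one of the addresses the resolver gives for host.

    Hypothetical helper: it mirrors the "local IP does not match any
    domain IP" message, nothing more.
    """
    return dest_ip in set(resolve(host))
```

If the client looked the name up through one resolver and got 198.51.100.7, while the proxy's resolver only returns 203.0.113.10, then host_matches_destination("example.org", "198.51.100.7", lambda h: {"203.0.113.10"}) is False, which is the mismatch case.  Pointing both at the same name server makes the two views agree.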

So, I just installed dnsmasq on two of my servers, pointed my clients toward that address, and so far it is working a whole lot better.  My hit rate is up in the 10% range, and that is with a nearly empty cache, so that may be the trick.  I only made the change a short time ago.  More importantly, that error in the log has gone away and I am getting consistent caching behavior, so that is huge.
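
To confirm that the error really has gone away, something like the following Python sketch can count the two messages quoted in this thread in a cache.log.  The marker strings are taken from the log lines above; the function name and the idea of feeding it lines from a file are assumptions for illustration, and the messages only appear when debug_options includes the 85,3 and 22,3 sections.

```python
from typing import Dict, Iterable

# Substrings copied from the cache.log excerpts quoted in this thread.
MARKERS = {
    "host_forgery": "hostHeaderVerifyFailed: SECURITY ALERT",
    "not_cacheable": "cacheableReply: NO",
}


def count_markers(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many log lines contain each marker substring."""
    counts = {name: 0 for name in MARKERS}
    for line in lines:
        for name, needle in MARKERS.items():
            if needle in line:
                counts[name] += 1
    return counts
```

Usage would be along the lines of opening the log with open(path) and passing the file object to count_markers, where path is wherever your cache.log lives.  Running it before and after the DNS change gives a quick comparison.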

Thank you!

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users