
Re: Dynamic/CDN Content Caching Challenges


 



Hi Amos,
You mentioned: "Better to Store-ID cache the thing its Location header is pointing to." The problem is that the Location header contains random strings in the URL, so the same object ends up with a unique URL every time:
Location: http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe

The random part of the URL is "/9546/46cfd241f1da4ae9812f512f7b36643c".

This is the situation I am trying to deal with.

--
Regards,
Faisal.



------ Original Message ------
From: "Muhammad Faisal" <faisalusuf@xxxxxxxxx>
To: "Amos Jeffries" <squid3@xxxxxxxxxxxxx>; squid-users@xxxxxxxxxxxxxxxxxxxxx
Sent: 4/14/2016 4:21:16 PM
Subject: Re:  Dynamic/CDN Content Caching Challenges

Thanks, I will keep grinding on other websites. I am currently working on getting streaming videos served from the cache. I'm a bit confused about why this request is a MISS. Is it because of the 206 status or some other reason?

TCP_MISS/206 3874196 GET http://cw002.foo.net/files/videos/2015/12/30/145148227265e28-360.mp4 - ORIGINAL_DST/a.b.c.d video/mp4

I'm trying to use a regexp with a Store-ID helper so that the video is saved as a single object and served from the cache, because the host part [cw002] can change and would otherwise result in a different object.

Is my understanding correct that the Store-ID helper is meant for objects that are identical but originate from different hosts?
http:\/\/(cws[0-9]+)\.foo.net\/files\/videos\/.*\/.*\/(.*\.mp4)

to store as http://cdn.foo.net/"; . $1
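
A minimal sketch of that mapping in Python, to make the idea concrete. It rests on several assumptions: the host prefix is taken from the single log line above (cw002, whereas the pattern above says "cws", so the sketch accepts both), the whole path under /files/videos/ is kept so that equally named files in different date directories do not collide, and the reply format assumes the helper runs without concurrency so the URL is the first token on each input line. The function names store_id and reply are made up for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern: any numbered "cw"/"cws" host under foo.net serving
# /files/videos/<path>/<name>.mp4. The host prefix is an assumption based on
# one log line and should be checked against more traffic.
VIDEO_RE = re.compile(r"^http://cws?[0-9]+\.foo\.net/files/videos/(.+\.mp4)$")


def store_id(url: str) -> Optional[str]:
    """Return a host-independent Store-ID for a video URL, or None."""
    m = VIDEO_RE.match(url)
    if m is None:
        return None
    # Keep the date directories and file name; drop only the changing host.
    return "http://cdn.foo.net/" + m.group(1)


def reply(line: str) -> str:
    """Build the helper reply for one request line (URL is the first token)."""
    parts = line.split()
    sid = store_id(parts[0]) if parts else None
    return f"OK store-id={sid}" if sid else "ERR"
```

A real helper would read request lines from stdin in a loop, print reply(line) for each one, and flush stdout after every reply.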



--
Regards,
Faisal.



------ Original Message ------
From: "Amos Jeffries" <squid3@xxxxxxxxxxxxx>
To: "Muhammad Faisal" <faisalusuf@xxxxxxxxx>; squid-users@xxxxxxxxxxxxxxxxxxxxx
Sent: 4/14/2016 3:59:14 PM
Subject: Re:  Dynamic/CDN Content Caching Challenges

On 14/04/2016 9:32 p.m., Muhammad Faisal wrote:
 Thanks Amos for a detailed response.
Well for Squid we are redirecting only HTTP traffic from policy routing.
 The object is unique which is being served to clients but due to
 different redirection of every user a new object is stored.

What about HTTP streaming content with a 206 response code, how should it be dealt with? AFAIK Squid does not cache 206 partial content. Is this correct?

Squid does not cache 206 from the server. But a HIT served by Squid can
be 206 status.
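
To illustrate the second half of that: once a cache holds the complete object, it can answer a Range request itself with a 206. The Python below is only a sketch of that idea, not Squid's code. It handles a single "bytes=first-last" range and nothing else, and the name serve_range is made up.

```python
import re
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def serve_range(body: bytes, range_header: str) -> Response:
    """Answer one 'bytes=first-last' range from a complete stored body."""
    total = len(body)
    full: Response = (200, {"Content-Length": str(total)}, body)
    m = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if m is None:
        # Suffix ranges and multi-range requests are out of scope here.
        return full
    first = int(m.group(1))
    if m.group(2) and int(m.group(2)) < first:
        # An invalid range spec is ignored and the whole object is sent.
        return full
    if first >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    last = min(int(m.group(2)), total - 1) if m.group(2) else total - 1
    part = body[first:last + 1]
    headers = {
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(len(part)),
    }
    return 206, headers, part
```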


 E.g. for filehippo, the sequence is below:

 When I click the download button there are two requests: first a 301,
 which contains the Location header for the requested content, and then a 200:

 301 Headers: ?

 GET /download/file/6853a2c840eaefd1d7da43d6f2c94863adc5f470927402e6518d70573a99114d/ HTTP/1.1
 Host: filehippo.com
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
 Accept-Encoding: gzip, deflate, sdch
 Accept-Language: en-US,en;q=0.8
 Cookie: FHSession=mfzdaugt4nu11q3yfxfkjyox; FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148; __utma=144473122.1934842269.1459345103.1459345103.1459345103.1; __utmb=144473122.3.10.1459345119355; __utmc=144473122; __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=144473122.|1=AB%20Test=new-home-v1=1
 Referer: http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
 Upgrade-Insecure-Requests: 1
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36

 HTTP/1.1 301 Moved Permanently
 Accept-Ranges: bytes
 Age: 0
 Cache-Control: private
 Connection: keep-alive
 Content-Length: 0
 Content-Type: text/html
 Date: Wed, 30 Mar 2016 13:38:45 GMT
 Location: http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
 Via: 1.1 varnish
 X-Cache: MISS
 X-Cache-Hits: 0
 x-debug-output: FHSession=mfzdaugt4nu11q3yfxfkjyox; FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148; __utma=144473122.1934842269.1459345103.1459345103.1459345103.1; __utmb=144473122.3.10.1459345119355; __utmc=144473122; __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=144473122.|1=AB%20Test=new-home-v1=1
 X-Served-By: cache-lhr6334-LHR


Ew. Borked server. 302 may be old but there are situations (this being
one) where it actually is appropriate to respond with a temporary status.

It also seems to contain an amateur attempt at cache-optimization by
someone who does not understand what middleware does.


You could technically force this to cache. But it's not worth it. Let the
site admin who made that yucky response deal with the 2x latency cost
they created. Better to Store-ID cache the thing its Location header is
pointing to.


 200 Header: Why is ATS not caching the octet stream despite having CONFIG
 proxy.config.http.cache.required_headers INT 1

Squid is not ATS. The 301 response above is CC:private so only the
receiving browser is allowed to cache it. What was the question?

GET /9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe HTTP/1.1
 Host: fs37.filehippo.com

What do you know about the components of that URL...

* What does "9546" mean;
 - just a random number?
 - some form of customer-ID videolan has with Filehippo?
 - some form of category ID that represents VLC software type etc?

* What does the long random looking hex number mean;
 - just a random visitor session ID?
 - the hash sum for the VLC binary being fetched?

... or something else?

Try some manual requests with different values and see what happens to
the response. Pay particular attention to the ETag response header, its
size, and if you want to be paranoid take the SHA1 and MD5 hashes of the
response object when it looks like it should be identical.
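
One way to make that comparison less error-prone is to reduce each response to a small fingerprint and diff the fingerprints. This is only a sketch in Python. The names fingerprint and differing_fields are made up, and the headers and body are whatever your client returned for each manual request.

```python
import hashlib
from typing import Dict, List, Mapping


def fingerprint(headers: Mapping[str, str], body: bytes) -> Dict[str, str]:
    """Summarise one response: ETag, body size, SHA1 and MD5 of the body."""
    lower = {name.lower(): value for name, value in headers.items()}
    return {
        "etag": lower.get("etag", ""),
        "size": str(len(body)),
        "sha1": hashlib.sha1(body).hexdigest(),
        "md5": hashlib.md5(body).hexdigest(),
    }


def differing_fields(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """List the fingerprint fields on which two responses disagree."""
    return [key for key in a if a[key] != b.get(key)]
```

An empty list from differing_fields for two URLs that differ only in one component suggests that component does not affect the object. A differing ETag with identical size and hashes is worth noting separately, since the bytes are the same even though the server labels them differently.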

Check your logs for patterns in the URLs, and test the other files you
find people fetching in the same way.

If that checks out then you know what your Store-ID pattern can drop and
what needs to be kept.
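
As a sketch of where that could end up for the filehippo URL, here is one possible mapping in Python. It ASSUMES the testing showed the 32-character hex segment is irrelevant to the content, which has not been verified here. Whether the number ("9546") must be kept is left as a switch because that is also unknown. The name filehippo_store_id and the "filehippo.squid.internal" key are placeholders chosen for illustration.

```python
import re
from typing import Optional

# ASSUMPTION: the hex segment is per-visitor noise and the fsNN host is just a
# mirror number. Neither has been confirmed; test before relying on this.
FILEHIPPO_RE = re.compile(
    r"^http://fs[0-9]+\.filehippo\.com/([0-9]+)/[0-9a-f]{32}/([^/?]+)$"
)


def filehippo_store_id(url: str, keep_number: bool = True) -> Optional[str]:
    """Map a download URL to a stable Store-ID, or None if it does not match."""
    m = FILEHIPPO_RE.match(url)
    if m is None:
        return None
    number, name = m.groups()
    # Keep the number by default: dropping too much risks serving wrong files.
    path = f"{number}/{name}" if keep_number else name
    return "http://filehippo.squid.internal/" + path
```

Keeping the number by default is the conservative choice: a pattern that drops too little only costs some HITs, while one that drops too much can hand clients the wrong object.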

This is the hard way, and a "lot of work" as I mentioned earlier. If you
want to help the community then please contribute back by putting your
findings into the wiki Store-ID database pages so all that work does not
go to waste.


 HTTP/1.1 200 OK
 Accept-Ranges: bytes
 Age: 739
 Connection: keep-alive
 Content-Length: 31367109
 Content-Type: application/octet-stream
 Date: Wed, 30 Mar 2016 13:26:43 GMT
 ETag: "81341be3a62d11:0"
 Last-Modified: Mon, 08 Feb 2016 06:34:21 GMT


Amos


_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users



