
Re: Re: Caching large files (i.e .ipsw)

Hey Archer,

I analyzed a couple of the addresses, and it seems the "edgesuite.net" domain is simply a CDN.
From the addresses you sent, you can use either of:
http://appldnld.apple.com/content.info.apple.com/iPod/SBML/osx/bundles/061-2967.20080313.Cnvkg/iPod_25.1.3.ipsw

http://appldnld.apple.com.edgesuite.net/content.info.apple.com/iPod/SBML/osx/bundles/061-2967.20080313.Cnvkg/iPod_25.1.3.ipsw

which are the exact same object by MD5 hash and ETag.
The only difference I have seen is in the expiration date.
Also, the XML file is quite an asset:
http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/

These are some of the addresses that will respond with that same XML:
http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/
http://itunes.apple.com/WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version

This means that the above URLs can be mapped together using the StoreID feature, since from a content point of view they are two or more URLs that lead to the same object, even at the MD5 and ETag level. The only blocker in these cases is the header "Cache-Control: max-age=0, no-cache, no-store", which makes these URLs unfriendly to caching.
These are pretty big files...
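To illustrate what StoreID does with such URLs: the helper gets the request URL from Squid and answers with a canonical key, so all the CDN variants collapse into a single cache entry. A simplified sketch of that exchange, using one of the iOS URLs from your mail and the appledl.squid.internal key produced by the patterns below (the real helper lines may carry extra fields depending on configuration):

# Squid -> helper (request URL):
http://appldnld.apple.com/iOS7/031-1020.20131022.14lik/iPad2,1_7.0.3_11B511_Restore.ipsw
# helper -> Squid (canonical key; the edgesuite.net variant of the same URL yields the same key):
OK store-id=http://appledl.squid.internal/iOS7/031-1020.20131022.14lik/iPad2,1_7.0.3_11B511_Restore.ipsw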

There is a potential issue with the 304 responses from the server, which can probably be ignored by default, since we do trust a server that responds with a 304 when verifying a cached file.

I would say these patterns can be used pretty safely (a squid.conf sketch for wiring them in follows the three patterns):

^http:\/\/([a-z0-9\.]+)\.apple\.com\.edgesuite\.net\/content\.info\.apple\.com\/((iOS|iPhone)[a-zA-Z0-9\/\.\,\_\-]+\.(ipsw|ipd|ipcc))$ http://appledl.squid.internal/$2

^http:\/\/([a-z0-9\.]+)\.apple\.com\/((iOS|iPhone)[a-zA-Z0-9\/\.\,\_\-]+\.(ipsw|ipd|ipcc))$ http://appledl.squid.internal/$2

^http:\/\/([a-z0-9\.]+)\.apple\.com\.edgesuite\.net\/((iOS|iPhone)[a-zA-Z0-9\/\.\,\_\-]+\.(ipsw|ipd|ipcc))$ http://appledl.squid.internal/$2
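A minimal sketch of how these patterns could be wired into squid.conf, assuming Squid 3.4+ with the bundled storeid_file_rewrite helper (the helper path and the file name /etc/squid/storeid_apple.txt are assumptions for your installation, and note that this particular helper expects the regex and the replacement to be TAB-separated in the file):

# squid.conf (sketch, Squid 3.4+)
store_id_program /usr/lib/squid/storeid_file_rewrite /etc/squid/storeid_apple.txt
store_id_children 5 startup=1 idle=1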

These are pretty risky ones:
http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/

^http:\/\/([a-z0-9\.]+)\.apple\.com\.edgesuite\.net\/(WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/[a-zA-Z0-9\.\-\_\/\?]*)$ http://appledlxml.squid.internal/$2

^http:\/\/([a-z0-9\.]+)\.apple\.com\/(WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/[a-zA-Z0-9\.\-\_\/\?]*)$ http://appledlxml.squid.internal/$2
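Whichever set of patterns you use, it may be worth consulting the helper only for the relevant domains so that every other request skips it entirely; a sketch (the ACL name apple_cdn is made up):

acl apple_cdn dstdomain .apple.com .apple.com.edgesuite.net
store_id_access allow apple_cdn
store_id_access deny all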


All of the above will need a refresh_pattern, something like this:
refresh_pattern ^http://(appledlxml|appledl)\.squid\.internal/.* 10080 80% 79900 refresh-ims override-expire ignore-reload ignore-private ignore-no-store reload-into-ims
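Keep in mind that refresh_pattern rules are matched in the order they appear in squid.conf and the first match wins, so this line has to sit above the default catch-all, roughly:

# specific patterns first, the default "." pattern last
refresh_pattern ^http://(appledlxml|appledl)\.squid\.internal/.* 10080 80% 79900 refresh-ims override-expire ignore-reload ignore-private ignore-no-store reload-into-ims
refresh_pattern . 0 20% 4320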

If anyone plans to use these patterns, note that they can lead to some strange cache behavior.
You can drop the appledlxml patterns, since they are sensitive... very.

Also, since edgesuite caches these files, I would assume you have pretty fast access to all of them. If you like this setup as it is and just want to cache what you can, you can try refresh_patterns like these:

refresh_pattern ^http://([a-z0-9\.]+)\.apple\.com\.edgesuite\.net\/((iOS|iPhone)[a-zA-Z0-9\/\.\,\_\-]+\.(ipsw|ipd|ipcc))$ 10080 80% 79900 refresh-ims override-expire ignore-reload ignore-private ignore-no-store reload-into-ims

refresh_pattern ^http:\/\/([a-z0-9\.]+)\.apple\.com\.edgesuite\.net\/content\.info\.apple\.com\/((iOS|iPhone)[a-zA-Z0-9\/\.\,\_\-]+\.(ipsw|ipd|ipcc))$ 10080 80% 79900 refresh-ims override-expire ignore-reload ignore-private ignore-no-store reload-into-ims
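One more thing, since .ipsw files run into the gigabytes: the cache also has to be allowed to store objects that large, and the default maximum_object_size is only a few MB. Something along these lines (the sizes, cache_dir type and path are just examples, adjust to your build and disk):

maximum_object_size 6 GB
cache_dir aufs /var/spool/squid 100000 16 256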

#end of patterns.
The above are hand-crafted and only partially tested, so be careful: I am human, and while I do not always make mistakes, it happens.

If I had more URLs for these domains:
http://swdownload.apple.com
http://swcdn.apple.com

I might be able to find a pattern for them also.

If you haven't come across StoreID yet, feel free to look at:
http://wiki.squid-cache.org/ConfigExamples/DynamicContent/Coordinator
http://wiki.squid-cache.org/Features/StoreID

These should clarify almost all of your doubts about de-duplication vs. dynamic content.

Also, I wanted to mention that the headers on these iOS files are very nice, since they include Content-MD5, which makes verification much easier.

As a side note:
An ICAP service that validates the full Content-MD5 hash for these specific domains and URLs could let the cache serve more objects, with a couple of twists along the way, by handing the question of whether a response is trustworthy enough to cache over to outside logic.
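For completeness, the Squid side of such an ICAP hookup is only a few directives; the service name, port and URL below are placeholders for whatever the external MD5-validating ICAP server would expose (writing that server is the real work):

# squid.conf (sketch): pass Apple download responses through an external ICAP service before caching
icap_enable on
icap_service apple_md5 respmod_precache bypass=1 icap://127.0.0.1:1344/apple_md5
acl apple_dl dstdomain .apple.com .apple.com.edgesuite.net
adaptation_access apple_md5 allow apple_dl
adaptation_access apple_md5 deny all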

Eliezer

On 11/06/2013 10:49 PM, Archer wrote:
Hopefully these two URLs will be of some help:

This link is used by iTunes every time a software update/restore is done.

http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/com.apple.jingle.appserver.client.MZITunesClientCheck/version/

The content on this server is only added to and almost never changed, so I'm
hoping I can tell squid that all content (except the initial XML file) is
always fresh so that it is never deleted.

i.e.
http://appldnld.apple.com/iOS6.1/091-2397.20130319.EEae9/iPad2,1_6.1.3_10B329_Restore.ipsw
http://appldnld.apple.com/iOS7/031-1020.20131022.14lik/iPad2,1_7.0.3_11B511_Restore.ipsw

When new software is brought out, it is simply added to the list rather than
old software being removed.



The following links are used for OS X software updates:

http://swdownload.apple.com
http://swcdn.apple.com

Honestly, I'm not entirely sure how these ones work, but I suspect it is
fairly similar to the iOS one above.







