Thank you Amos for the detailed reply!

I'm only going to prefetch specific pages (using regexp matching for
URL patterns) that are pretty static, and I'll check squid-prefetch.

Best,
Jianshi

On Tue, May 6, 2014 at 5:04 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> On 6/05/2014 5:53 p.m., Jianshi Huang wrote:
>> Hi,
>>
>> I need to build a prefetching proxy to speed up page loading/clicks,
>> and I'm currently investigating Squid for a prototype. The websites I
>> want to speed up are all under HTTPS.
>>
>> I briefly scanned Squid's documentation and googled the keywords; it
>> looks like I need the following setup:
>>
>> 1) Use SSL-Bump (and install Squid's cert on the client's machine)
>
> Yes, this is the only way to get around the HTTPS "problem".
>
>> 2) A two-Squid setup: one runs the prefetching script, the other does
>> the caching.
>
> Any reason for that design?
> Prefetching only needs three components: the cache (Squid), the logic
> deciding what to fetch (a script), and possibly a database of past info
> to inform those decisions.
>
> Check out the squidclient tool we provide for making arbitrary web
> requests. It is the best tool around for scripting web requests, with
> similar levels of control to libcurl, which is (probably) the best for
> use in compiled code.
>
>
>> Does that make sense?
>
> Pre-fetching is a very old idea based around metrics from very old
> protocols such as HTTP/1.0, where the traffic was static and
> predictable and prefetching made a fair bit of sense.
>
> However, there are several popular features built into the HTTP/1.1
> protocol which greatly alter that balance. Dynamic content with
> variants makes the traffic far less static. Response negotiation makes
> the responses far more unpredictable. Persistent connections greatly
> reduce the lag times. Revalidation reduces the bandwidth costs.
> Together these all make prefetching in HTTP/1.1 a much less beneficial
> operation than most of the literature makes it seem.
>
> Whether it makes sense depends entirely on where it is being installed,
> what the traffic is like, and how and why the prefetching decisions are
> being made. Only you can really answer those, and it may actually take
> doing it to figure out whether it was a bad choice to begin with.
>
>
>> Is there a better solution?
>
> At the current point of Internet development I believe throwing effort
> into assisting us with HTTP/2 development would be more beneficial. But
> I am somewhat biased, being a member of the HTTPbis WG and seeking to
> get Squid HTTP/2 support off the ground.
>
>
>> Or has anybody done similar things?
>
> We get a fairly regular flow of questions from people wanting to do
> pre-fetching. They all hit the above issues eventually and drop out of
> sight.
>
> This thread summarizes the standard problems and answers:
> http://arstechnica.com/civis/viewtopic.php?f=16&t=1204579
> (see fandingo's answer near the bottom)
>
> I am aware of this tool being used by more than a few dozen
> installations, although its popularity does seem to be in decline:
> https://packages.debian.org/unstable/web/squid-prefetch
>
>
>> It would be great if someone could point out some
>> configuration/scripting files to me. Code speaks :)
>
> Everything in this regard is situation dependent. The above are likely
> to be the best you can find. People who actually get it going
> (apparently anyway) keep the secrets of how to avoid the HTTP/1.1
> issues pretty close.
>
> HTH
> Amos

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
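
For the SSL-Bump setup confirmed above, a minimal squid.conf sketch in
Squid 3.4-era syntax. The listening port, CA cert/key paths, and helper
locations are placeholders for illustration, not anything taken from
this thread:

    # Bump CONNECT tunnels so HTTPS responses become cacheable.
    # proxyCA.pem/.key is the local CA whose cert gets installed on
    # every client machine (paths are placeholders).
    http_port 3128 ssl-bump \
        generate-host-certificates=on \
        dynamic_cert_mem_cache_size=4MB \
        cert=/etc/squid/proxyCA.pem \
        key=/etc/squid/proxyCA.key

    # Helper that mints per-host certificates signed by the CA above;
    # the binary path varies by distro.
    sslcrtd_program /usr/lib/squid/ssl_crtd -s /var/lib/squid/ssl_db -M 4MB

    # server-first: contact the origin server before forging the
    # certificate presented to the client.
    ssl_bump server-first all

Whether the bumped responses are actually stored still depends on the
origin's Cache-Control headers, per Amos's HTTP/1.1 caveats above.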
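
And a rough sketch of the fetch side using the squidclient tool Amos
recommends: a shell loop that pulls a URL list through the proxy purely
to warm the cache. The file name, regexp, and proxy address are all
assumptions standing in for site-specific choices:

    #!/bin/sh
    # prefetch.sh -- warm Squid's cache with the mostly-static pages.
    # candidates.txt holds one URL per line; the egrep pattern stands
    # in for whatever regexp selects the pages worth prefetching.
    egrep 'https://www\.example\.com/(docs|static)/' candidates.txt |
    while read url; do
        # Fetch through the local Squid so the response lands in its
        # cache; the body itself is discarded, only the caching side
        # effect matters.
        squidclient -h 127.0.0.1 -p 3128 -m GET "$url" >/dev/null
    done

Run from cron or after the "database of past info" Amos mentions has
produced a fresh candidate list; this keeps all three components (cache,
decision logic, history) in the single-Squid design he suggests.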