
Re: Re: Accelerating Proxy options?

On Tue, 19 Apr 2011 13:31:38 -0700, Linda Walsh wrote:
Amos Jeffries wrote:
On Mon, 18 Apr 2011 18:30:51 -0700, Linda Walsh wrote:
[wondering about squid accelerator features such as...]
1) Parsing fetched webpages and looking for statically included content
 and starting a "fetch" on those files as soon as it determines
 page-requisites
Squid is designed not to touch the content. Doing so makes things slower.
----
   Um, you mean:  "Doing so can often make things slower."   :-)

Almost always. ;-)


   It depends on the speed of the CPU where Squid is running relative
to the external line speed.   Certainly, you would agree that if the
external line speed is 30Bps, for example, Squid would have much greater
latitude to "diddle" with the content before a performance impact would
be noticed.

   I would agree that doing such processing "in-line" would create
a performance impact; even now, with no such processing being done, I
see Squid adding roughly 10-30% overhead compared to a direct connection
to *fast* sites.  However, I would only consider doing such work outside
of the direct I/O chain, via separate threads or processes.

Not easily possible in Squid. The I/O chain for the body is currently almost exactly read FD 1 -> write FD 2.
Doing any changes at all inline means making it:
 read FD 1 -> buffer -> scan -> process -> copy result -> write FD 2.

About 2x-3x delay even if there is no change to be made.

We get away with ~5% lag from chunked encoding because the chunks are predictable in advance and the intermediate bytes can drop down to that read->write efficiency within chunks.
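The read -> write chain described above can be pictured as a relay loop. Here is a minimal sketch in Python (purely illustrative; this is not Squid's actual code, and the function names are made up) of the extra buffer/scan/copy steps that inline processing adds:

```python
import io

def relay_direct(src, dst, bufsize=4096):
    """Plain relay: read from src, write straight to dst (read FD 1 -> write FD 2)."""
    while True:
        chunk = src.read(bufsize)
        if not chunk:
            break
        dst.write(chunk)

def relay_with_scan(src, dst, scan, bufsize=4096):
    """Inline-processing relay: read -> buffer -> scan -> process -> copy -> write.
    Even when scan() changes nothing, every byte is buffered and copied again."""
    buffered = bytearray()
    while True:
        chunk = src.read(bufsize)
        if not chunk:
            break
        buffered.extend(chunk)          # extra copy into the scan buffer
    result = scan(bytes(buffered))      # scan/process pass over the whole body
    dst.write(result)                   # extra copy out to the writer

# usage: an identity scan() still pays the buffering and copying cost
src = io.BytesIO(b"<html>page body</html>")
dst = io.BytesIO()
relay_with_scan(src, dst, scan=lambda body: body)
```

Even with an identity scan() that changes nothing, every byte of the body takes the detour through the scan buffer before it reaches the writer, which is where the extra delay comes from.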


Picture this: I (on a client sys) pull in a web page. At the same time I get it, it's handed over to a separate process running on a separate core

read -> copy to reader thread buffer -> copy to processing thread buffer -> copy to result output buffer (maybe) -> copy to writer thread buffer -> write.

2x slowdown *on top of* the above processing scan lags. This exact multiple-copying problem is one of the two reasons we do not have threading in Squid. We are instead working towards the Apache model: one process fully handles a request transaction, with IPC callouts to linked workers which can provide shared details as needed. Threads may appear at the sub-transaction layer to handle specific things, decided on a case-by-case basis. More in the wiki under SmpSupport.

that begins processing. Even if the server and client parse at the same
speed, the server would have an edge in formulating the "pre-fetch"
requests simply because it's on the same physical machine (and doesn't
have any client-server latency).  The server might have an additional
edge since it would only be looking through fetched content for
"pre-fetchables" and not concerning itself with rendering issues.
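The page-requisite scanning being discussed could be sketched as a toy helper (an illustration only; the class name and tag set here are assumptions, and real pages have many more requisite sources):

```python
from html.parser import HTMLParser

class RequisiteFinder(HTMLParser):
    """Collect statically included page-requisites (stylesheets, images,
    scripts) that a prefetching helper could start fetching early."""
    def __init__(self):
        super().__init__()
        self.requisites = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            self.requisites.append(attrs["src"])
        elif tag == "script" and "src" in attrs:
            self.requisites.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            self.requisites.append(attrs["href"])

# usage: run the finder over a fetched page body
finder = RequisiteFinder()
finder.feed('<html><link rel="stylesheet" href="/s.css">'
            '<img src="/logo.png"><script src="/app.js"></script></html>')
```

Note that this only looks for requisites, exactly as suggested; it never needs to resolve rendering questions, which is where the "additional edge" over a browser would come from.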

There are ICAP server apps and eCAP modules floating around that people have written to plug into Squid and do it. The only public one AFAICT is the one doing gzipping, the others are all proprietary or private projects.
---
  Too bad there is no "CSAN" repository akin to Perl's CPAN, and no
comparable level of community motivation for adding to such a
repository.




2. Another level would be pre-inclusion of included content for pages
that have already been fetched and are in cache.  [...]
ESI does this. But requires the website to support ESI syntax in the page code.
---
  ESI?  Is there a TLA URL for that? ;-)


It is mostly server-side stuff.
http://en.wikipedia.org/wiki/Edge_Side_Includes covers the ESI syntax etc.

When the component is built into Squid (--enable-esi) it is pretty much automatic on reverse-proxy requests.

The only applicable config in Squid is http://wiki.squid-cache.org/Features/Surrogate, and for any well-configured reverse-proxy the visible_hostname (being a unique public FQDN) is the default advertised surrogate ID anyway.
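For reference, a page opting into ESI marks its dynamic fragments with esi:include tags, roughly like this (a markup sketch per the ESI 1.0 syntax; the fragment URL is a placeholder):

```
<html>
  <body>
    <!-- static markup is cached as-is by the surrogate -->
    <p>Welcome back!</p>
    <!-- the surrogate (Squid built with --enable-esi) resolves this include per-request -->
    <esi:include src="/fragments/user-greeting" onerror="continue"/>
  </body>
</html>
```

The origin server only has to emit the markup; assembly happens on the reverse-proxy, which is why it is "mostly server-side stuff" from Squid's point of view.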



  Anyway, just some wonderings...
What will it take for Sq3 to get to the feature level of Sq2 and allow,

What we are missing in a big way is store-URL and location-URL re-writers.

And speed ... we have a mixed bag of benchmarks for the relative speed of 3.2 and 2.7. They appear pretty much equal or 3.2 ahead for some simple tests now. Some common components (ie ACLs and DNS) need a bit more speed optimization before 3.2 is ahead in general.

ETag variants, collapsed forwarding, and background revalidation (leading to stale-while-revalidate support) would be nice to improve speed, but are not essential for the deprecation of 2.7.

for example, caching of dynamic content?

All Squid versions have that; it was only ever a configuration default. Though Squid's HTTP/1.1 caching compliance is dodgy in releases mid-series 2.6 and older.


Also, what will it take for Sq3 to get full, included HTTP1.1 support?

Squid-3.1 does HTTP/1.1 to servers by default. Squid-3.2 to clients by default as well (AND full chunked encoding support for persistent connections).


  It __seems__ like, though it's been out for years, it hasn't made
much progress on those fronts.  Are they simply not a priority?

 Especially getting to the 1st goal (Sq3>=Sq2), I would think, would
consolidate community efforts at improvement and module construction
(e.g. caching dynamic content like that from youtube and the
associated wiki directions for doing so under Sq2, which are
inapplicable to Sq3)...

I've had a handful of people stick their hands up to do this over the last year or two. I pointed them at the squid-2 patches, which need adjusting to compile and work in squid-3 code. Never heard from them again. :(

(chomping at the bit for Sq2 to become obviated by Sq3)...

Me too.

Amos


