
Re: Re: Accelerating Proxy options?

On Tue, 19 Apr 2011 13:31:38 -0700, Linda Walsh wrote:
Amos Jeffries wrote:
On Mon, 18 Apr 2011 18:30:51 -0700, Linda Walsh wrote:
[wondering about squid accelerator features such as...]
1) Parsing fetched webpages and looking for statically included content
 and starting a "fetch" on those files as soon as it determines
 page-requisites
Squid is designed not to touch the content. Doing so makes things slower.
----
   Um, you mean:  "Doing so can often make things slower."   :-)

Almost always. ;-)


   It depends on the speed of the CPU where Squid is running relative
to the external line speed.   Certainly, you would agree that if the
external line speed is 30Bps, for example, Squid would have much greater
latitude to "diddle" with the content before a performance impact would
be noticed.

   I would agree that doing such processing "in-line" would create
a performance impact; even now, with no such processing being done, I
see Squid adding roughly 10-30% overhead compared to a direct connection
to *fast* sites.  However, I would only consider doing such work outside
of the direct I/O chain, via separate threads or processes.

Not easily possible in Squid. The I/O chain for the body is currently almost exactly read FD 1 -> write FD 2.
Doing any changes at all inline means making it:
 read FD 1 -> buffer -> scan -> process -> copy result -> write FD 2.

About 2x-3x delay even if there is no change to be made.

We get away with ~5% lag from chunked encoding because the chunks are predictable in advance and the intermediate bytes can drop down to that read->write efficiency within chunks.
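The read -> write chain described above can be pictured as a relay loop. Here is a minimal sketch in Python (purely illustrative; this is not Squid's actual code, and the function names are made up) of the extra buffer/scan/copy steps that inline processing adds:

```python
import io

def relay_direct(src, dst, bufsize=4096):
    """Plain relay: read from src, write straight to dst (read FD 1 -> write FD 2)."""
    while True:
        chunk = src.read(bufsize)
        if not chunk:
            break
        dst.write(chunk)

def relay_with_scan(src, dst, scan, bufsize=4096):
    """Inline-processing relay: read -> buffer -> scan -> process -> copy -> write.
    Even when scan() changes nothing, every byte is buffered and copied again."""
    buffered = bytearray()
    while True:
        chunk = src.read(bufsize)
        if not chunk:
            break
        buffered.extend(chunk)          # extra copy into the scan buffer
    result = scan(bytes(buffered))      # scan/process pass over the whole body
    dst.write(result)                   # extra copy out to the writer

# usage: an identity scan() still pays the buffering and copying cost
src = io.BytesIO(b"<html>page body</html>")
dst = io.BytesIO()
relay_with_scan(src, dst, scan=lambda body: body)
```

Even with an identity scan() that changes nothing, every byte of the body takes the detour through the scan buffer before it reaches the writer, which is where the extra delay comes from.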


Picture this: I (on a client sys) pull in a web page. At the same time I get it, it's handed over to a separate process running on a separate core

read -> copy to reader thread buffer -> copy to processing thread buffer -> copy to result output buffer (maybe) -> copy to writer thread buffer -> write.

2x slowdown *on top of* the above processing scan lags. This exact multiple-copying problem is one of the two reasons we do not have threading in Squid. We are instead working towards the Apache model: one process fully handles a request transaction, with IPC callouts to linked workers which can provide shared details as needed. Threads may appear at the sub-transaction layer to handle specific things, decided on a case-by-case basis. More in the wiki under SmpSupport.

that begins processing. Even if the server and client parse at the same
speed, the server would have an edge in formulating the "pre-fetch"
requests simply because it's on the same physical machine (and doesn't
have any client-server latency).  The server might have an additional
edge since it would only be looking through fetched content for
"pre-fetchables" and not concerning itself with rendering issues.
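The page-requisite scanning being discussed could be sketched as a toy helper (an illustration only; the class name and tag set here are assumptions, and real pages have many more requisite sources):

```python
from html.parser import HTMLParser

class RequisiteFinder(HTMLParser):
    """Collect statically included page-requisites (stylesheets, images,
    scripts) that a prefetching helper could start fetching early."""
    def __init__(self):
        super().__init__()
        self.requisites = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            self.requisites.append(attrs["src"])
        elif tag == "script" and "src" in attrs:
            self.requisites.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            self.requisites.append(attrs["href"])

# usage: run the finder over a fetched page body
finder = RequisiteFinder()
finder.feed('<html><link rel="stylesheet" href="/s.css">'
            '<img src="/logo.png"><script src="/app.js"></script></html>')
```

Note that this only looks for requisites, exactly as suggested; it never needs to resolve rendering questions, which is where the "additional edge" over a browser would come from.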

There are ICAP server apps and eCAP modules floating around that people have written to plug into Squid and do it. The only public one AFAICT is the one doing gzipping, the others are all proprietary or private projects.
---
  Too bad there is no "CSAN" repository akin to Perl's CPAN, and no
comparable level of community motivation for adding to such a
repository.




2. Another level would be pre-inclusion of included content for pages
that have already been fetched and are in cache.  [...]
ESI does this. But requires the website to support ESI syntax in the page code.
---
  ESI?  Is there a TLA URL for that? ;-)


It is mostly server-side stuff.
http://en.wikipedia.org/wiki/Edge_Side_Includes covers the ESI syntax etc.

When the component is built into Squid (--enable-esi) it is pretty much automatic on reverse-proxy requests.

The only applicable config in Squid is http://wiki.squid-cache.org/Features/Surrogate, and for any well-configured reverse-proxy the visible_hostname (being a unique public FQDN) is the default advertised surrogate ID anyway.
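For reference, a page opting into ESI marks its dynamic fragments with esi:include tags, roughly like this (a markup sketch per the ESI 1.0 syntax; the fragment URL is a placeholder):

```
<html>
  <body>
    <!-- static markup is cached as-is by the surrogate -->
    <p>Welcome back!</p>
    <!-- the surrogate (Squid built with --enable-esi) resolves this include per-request -->
    <esi:include src="/fragments/user-greeting" onerror="continue"/>
  </body>
</html>
```

The origin server only has to emit the markup; assembly happens on the reverse-proxy, which is why it is "mostly server-side stuff" from Squid's point of view.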



  Anyway, just some wonderings...
What will it take for Sq3 to get to the feature level of Sq2 and allow,

What we are missing in a big way is store-URL and location-URL re-writers.

And speed ... we have a mixed bag of benchmarks for the relative speed of 3.2 and 2.7. They appear pretty much equal or 3.2 ahead for some simple tests now. Some common components (ie ACLs and DNS) need a bit more speed optimization before 3.2 is ahead in general.

ETag variants, collapsed forwarding, and background revalidation (leading to stale-while-revalidate support) would be nice to improve speed, but are not essential for the deprecation of 2.7.

for example, caching of dynamic content?

All Squid versions have that; it was only ever a configuration default. Though Squid's HTTP/1.1 caching compliance is dodgy in releases mid-series 2.6 and older.


Also, what will it take for Sq3 to get full, included HTTP1.1 support?

Squid-3.1 does HTTP/1.1 to servers by default. Squid-3.2 to clients by default as well (AND full chunked encoding support for persistent connections).


  It __seems__ like, though it's been out for years, it hasn't made
much progress on those fronts.  Are they simply not a priority?

 Especially getting to the 1st goal (Sq3>=Sq2), I would think, would
consolidate community efforts at improvement and module construction
(e.g. caching dynamic content like that from youtube and the
associated wiki directions for doing so under Sq2, which are
inapplicable to Sq3)...

I've had a handful of people stick their hands up to do this over the last year or two. I pointed them at the squid-2 patches, which need adjusting to compile and work in squid-3 code. Never heard from them again. :(

(chomping at the bit for Sq2 to become obviated by Sq3)...

Me too.

Amos


