Re: reverse proxy filtering?

Jeff Sadowski wrote:
On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
Jeff Sadowski wrote:
On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
Jeff Sadowski wrote:
I'm new to trying to use squid as a reverse proxy.

I would like to filter out certain pages and, if possible, certain words.
I installed Perl so that I can use it to rebuild pages; is that possible?

My squid.conf looks like this:
<==== start
acl all src all
http_port 80 accel defaultsite=outside.com
cache_peer inside parent 80 0 no-query originserver name=myAccel
acl our_sites dstdomain outside.com
aha, aha, ..

http_access allow all
eeek!!
I want everyone on the outside to see the inside server minus one or
two pages. Is that not what I set up?
Only by the lucky chance of some background defaults, and assuming that the web
server is highly secure on its own.

If you have a small set of sites, such as those listed in "our_sites", then
it's best to be certain and use that ACL for the allow as well.

 http_access allow our_sites
 http_access deny all

... same on the cache_peer_access below.
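For example, with the names already used above that would be:

 cache_peer_access myAccel allow our_sites
 cache_peer_access myAccel deny all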

cache_peer_access myAccell all
<==== end

how would I add it so that for example

http://inside/protect.html

is blocked?
http://wiki.squid-cache.org/SquidFaq/SquidAcl
so I want redirector_access?
Is there an example line of this in a file?

I tried using

url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

but I guess that requires more to use it? An acl?
Should "acl all src all" be "acl all redirect all"?
No to all three. The line you mention trying is all that's needed.

 url_rewrite_access allow all

but the above should be the default when a url_rewrite_program is set.

so how do you tell it to use the url_rewrite_program with the inside site?
Or does it use the script on all pages passing through the proxy?

It changes the request as it is passed on to the web server, in transit. So the client still sees what they clicked on, but gets content from the other site. It does not affect links or anything else in the page content.
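
If you ever want the rewriter consulted only for particular requests, url_rewrite_access takes the same ACLs as http_access; a rough sketch reusing the "our_sites" acl from above:

 url_rewrite_access allow our_sites
 url_rewrite_access deny all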


Is this only a rewrite on the requested URL from the web browser?
Ahh, that might answer some of my earlier questions. I never tried
clicking on it after implementing the rewrite script; I was only
hovering over the URL and seeing that it was still the same.

What is making you think it's not working? And what do the logs say about it?

If you only checked the page's links, they may not change. The logs should show where the client went and the IP/name of the server the content was fetched from, which would be the name of the redirected server.

Also, what is the c:\replace.pl code?


<=== start
#!c:\perl\bin\perl.exe
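# Unbuffered output is required for Squid helpers; Squid feeds one request
# line at a time on stdin and expects the (possibly rewritten) URL back on
# stdout.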
$| = 1;
$replace="<a href=http://inside/login.html.*?</a>";
$with="no login";
while ($INPUT = <>) {
    $INPUT =~ s/$replace/$with/gi;
    print $INPUT;
}
<=== end

I think I see the problem now. I guess I am looking for something else
besides url_rewrite, maybe a full text replacement :-/

That's what your code wants, not what I pointed you to using.

You know, I'm thinking you could get away without altering those pages, and just block external clients from visiting those URLs.
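
Something like this, for example; the acl pattern is just a guess at where the login pages live, so adjust the regex to the real URLs:

 acl login_pages urlpath_regex -i ^/login
 http_access deny login_pages
 http_access allow our_sites
 http_access deny all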


and is it possible to filter/replace certain words on the site

like replace "Albuquerque" with "Duke City" for an example on all pages?
No. no. no. Welcome to copyright violation hell.
This was an example. I have full permission to do the real translations.
I am told to remove certain links/buttons to login pages, thus I
replace "<a href=inside>button</a>" with "". Currently I have a
pathetic Perl script that doesn't support cookies and is going through
each set of previous pages to bring up the content. I was hoping squid
would greatly simplify this.
I was using WWW::Mechanize; I know this isn't the best way, but they
just need a fast and dirty way.
Ah, okay. Well, the only ways squid has of doing content alteration are also
far too heavyweight for that use (coding up an ICAP server and processing
rules, or a full eCAP adaptor plugin).
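
Just so you can see the scale of it, the Squid side of an ICAP hookup is only a few lines; this is a rough sketch in Squid 3.0 syntax, assuming a RESPMOD service already listening at icap://127.0.0.1:1344/respmod. The ICAP service itself, which would do the actual HTML editing, is the part you would have to write:

 icap_enable on
 icap_service html_edit respmod_precache 0 icap://127.0.0.1:1344/respmod
 icap_class html_edit_class html_edit
 icap_access html_edit_class allow all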

IMO you need to kick the webapp developers to make their app do the removal
under the right conditions. It would solve many more problems than having
different copies of a page available with identical identifiers.

Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
 Current Beta Squid 3.1.0.7


