Re: Squid->DG->Squid

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Wed, 27 Jul 2011 18:15:09 +1200

On 27/07/11 00:12, Andy Rogers wrote:
Hi

Over the past month I have been setting up and implementing a Squid3
Setup which uses SSO against Windows DC's, after originally following
the excellent guide from
http://howtoforge.com/debian-squeeze-squid-kerberos-ldap-authentication-active-directory-integration-and-cyfin-reporter.

I then decided to take this one step further and introduce
DansGuardian into the loop for Content Filtering.

See that word "loop"? It's important.

The Setup I am working with is SquidProxy->DG->SquidProxy.

The well known "sandwich" setup is designed for use with three proxies:
  two Squid and one DG.

You are hitting the equally well known problem when you try to perform 
it with only two proxies: one squid, one DG.

The First is Squid to Authorise against AD and listen on Port 8080,
then pass onto DG for Content Filtering on Port 8081, and then DG to
pass back to the original single instance of Squid running back to
Port 3128.

This does work very well, but after studying some logs I don't think
much is being pulled from the Squid Cache, most of my squid access.log
is like this

---
1311675442.422    228 localhost TCP_MISS/200 2093 GET
http://static.howtoforge.com/images/teaser/ubuntu.gif -
DIRECT/178.63.27.110 image/gif
1311675442.423    268 192.168.22.107 TCP_MISS/200 2499 GET
http://static.howtoforge.com/images/teaser/ubuntu.gif user@MY.LOCAL
FIRST_UP_PARENT/127.0.0.1 image/gif
1311675442.473    134 localhost TCP_MISS/200 4103 GET
http://static.howtoforge.com/themes/htf_glass/images/star_vmware_image_red.png
- DIRECT/178.63.27.110 image/png
1311675442.473    166 192.168.22.107 TCP_MISS/200 4509 GET
http://static.howtoforge.com/themes/htf_glass/images/star_vmware_image_red.png
user@MY.LOCAL FIRST_UP_PARENT/127.0.0.1 image/png
1311675442.540    131 localhost TCP_MISS/200 1313 GET
http://howtoforge.com/themes/htf_glass/images/next_page.gif -
DIRECT/188.40.16.205 image/gif
1311675442.540    161 192.168.22.107 TCP_MISS/200 1719 GET
http://howtoforge.com/themes/htf_glass/images/next_page.gif
user@MY.LOCAL FIRST_UP_PARENT/127.0.0.1 image/gif
---

And the corresponding DG access.log shows

---
2011.7.26 11:17:18 user@my.local 192.168.22.107 http://howtoforge.com
*SCANNED*  GET 63490 -320  1 200 text/html   -v
---

Also in my cache.log I have spotted that on almost all websites I am
getting this "WARNING: Forwarding loop detected for:" :-

---
2011/07/26 11:11:51| WARNING: Forwarding loop detected for:
GET /debian-squeeze-squid-kerberos-ldap-authentication-active-directory-integration-and-cyfin-reporter
HTTP/1.0
<snip,snip>
Via: 1.1 squid.my.local (squid/3.1.14)
X-Forwarded-For: 192.168.22.107
X-Forwarded-For: 192.168.22.107
---

192.168.22.107 just connected in,
 forwarding for 192.168.22.107 via squid.my.local which is
 forwarding for 192.168.22.107 ...

How was Squid to know "..." wasn't going to be an infinite series of 
192.168.22.107 via itself?

Is there anything incorrect, or that needs adding/changing, in my squid.conf
so I can get Squid to get a better hit rate from the stored cache &
also remove the "WARNING: Forwarding loop detected for" message in
my cache.log?
Would I need to set up 2 separate Squid instances, as
Squid1->DG->Squid2, instead?

This is the sandwich configuration looping on itself. You have several 
choices:

 * configure two instances of Squid
 * configure client->DG->Squid
 * configure client->Squid->DG
 * configure "via off" and cross your fingers there is never any actual 
infinite loop. You will need a Squid built with HTTP violations enabled 
to do that. Loop protection is REQUIRED by the HTTP standards.

This only solves the loop issue though. Cache MISS is separate...

 If I do, how would I need to alter my
current squid.conf into 2 separate files?

Copy-n-paste the /etc/init.d/squid3 file and make each copy load a 
different config file into Squid with the -f startup option.

Then make each config file do the job you want its Squid to do. 
It doesn't matter which instance is which, as long as each config is 
clear about what it connects up to.
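
A rough sketch of what the two files might look like. The file names, 
hostnames, pid/log paths and cache_dir sizes below are made up for 
illustration. Everything else (auth, ACLs, http_access rules) would carry 
over from your existing config into whichever instance needs it:

  # /etc/squid3/squid-front.conf (hypothetical name)
  # clients connect here; requests go out via DG on 8081
  # started with: squid3 -f /etc/squid3/squid-front.conf
  http_port 8080
  unique_hostname squid-front.my.local
  pid_filename /var/run/squid3-front.pid
  access_log /var/log/squid3/front-access.log squid
  cache_log /var/log/squid3/front-cache.log
  cache_dir ufs /var/spool/squid3-front 100 16 256
  cache_peer 127.0.0.1 parent 8081 0 no-query no-digest no-netdb-exchange

  # /etc/squid3/squid-back.conf (hypothetical name)
  # DG connects here; requests go DIRECT to the Internet
  # started with: squid3 -f /etc/squid3/squid-back.conf
  http_port 127.0.0.1:3128
  unique_hostname squid-back.my.local
  pid_filename /var/run/squid3-back.pid
  access_log /var/log/squid3/back-access.log squid
  cache_log /var/log/squid3/back-cache.log
  cache_dir ufs /var/spool/squid3-back 100 16 256

Each instance needs its own pid file, logs and cache_dir so they do not 
trample each other. Each also needs its own unique_hostname, otherwise 
one instance will see the other's Via header as its own and report a 
forwarding loop again.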

On the whole, since you have requests that skip DG, and some of those 
get cached, you are not in a position to only have caching on the second 
loop. Things entering the first loop WILL be found in the cache even if 
you wanted them to go through DG first.

 I think you are in a safe position to simply choose the 
client->Squid->DG option: do auth and caching in Squid, and send on 
through DG only the things which need to go through it.
 OR, pick the client->DG->Squid option for just the clients which need 
to go through DG. Others can go straight client->Squid.

My squid.conf

---
####### /etc/squid3/squid.conf Configuration File #######
####### cache manager
cache_mgr squid@xxxxxxxxxxxxxxx

####### kerberos authentication
auth_param negotiate program /usr/lib/squid3/squid_kerb_auth -d -s
HTTP/proxy.my.local
auth_param negotiate children 10
auth_param negotiate keep_alive on

####### provide access via ldap for clients not authenticated via kerberos
auth_param basic program /usr/lib/squid3/squid_ldap_auth -R \
      -b "dc=my,dc=local" \
      -D squid@my.local \
      -w "password" \
      -f sAMAccountName=%s \
      -h dc.my.local
auth_param basic children 10
auth_param basic realm Internet Proxy
auth_param basic credentialsttl 1 minute
####### ldap authorizations
# restricted proxy access logged
external_acl_type internet_users %LOGIN /usr/lib/squid3/squid_ldap_group -R -K \
      -b "dc=my,dc=local" \
      -D squid@my.local \
      -w "password" \
      -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
Users,ou=Internet Groups,dc=my,dc=local))" \
      -h dc.tg.local
# full proxy access no logging
external_acl_type internet_users_full_nolog %LOGIN
/usr/lib/squid3/squid_ldap_group -R -K \
      -b "dc=my,dc=local" \
      -D squid@my.local \
      -w "password" \
      -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
Users Full NoLog,ou=Internet Groups,dc=my,dc=local))" \
      -h dc.tg.local
# full proxy access logged
external_acl_type internet_users_full_log %LOGIN
/usr/lib/squid3/squid_ldap_group -R -K \
      -b "dc=my,dc=local" \
      -D squid@my.local \
      -w "password" \
      -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
Users Full Log,ou=Internet Groups,dc=my,dc=local))" \
      -h dc.tg.local

####### acl for proxy auth and ldap authorizations
acl auth proxy_auth REQUIRED

# format "acl, aclname, acltype, acltypename, activedirectorygroup"
acl RestrictedAccessLog external internet_users Internet\ Users
acl FullAccessNoLog external internet_users_full_nolog Internet\
Users\ Full\ NoLog
acl FullAccessLog external internet_users_full_log Internet\ Users\ Full\ Log

Couple of notes:
 1) \-escape syntax does not work here. Whitespace needs to be 
url-encoded if the backend accepts that. squid_ldap_group is not one of 
those though.

The solution is to load the group name(s) from a file into the acl line. 
And in the file /etc/squid3/InternetUsersFullLog put exactly one line:
   Internet Users Full Log

When this is done you will find the line from that file gets passed 
as the group value to the external_acl_type helper.
Which means you no longer have to hard-code the group name into six 
different places in the config. One external_acl_type which makes use of 
the group value "%g" can be used for all the external "acl" lines.

Example:

  external_acl_type group %LOGIN \
    /usr/lib/squid3/squid_ldap_group -R -K \
       -b "dc=my,dc=local" \
       -D squid@my.local \
       -w "password" \
       -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=%g,dc=my,dc=local))" \
       -h dc.tg.local

  acl FullAccessLog external group "/etc/squid3/IUFullLog"
  acl FullAccessNoLog external group "/etc/squid3/IUFullNoLog"
  ...

acl whitelistsites url_regex -i "/etc/squid3/whitelistsites.txt"
acl blockedsites url_regex -i "/etc/squid3/blockedsites.txt"

####### squid defaults
acl manager proto cache_object
acl localhost src 127.0.0.1/32 ::1
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
acl SSL_ports port 443
acl Safe_ports port 80-81       # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443       # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost

####### enforce auth: order of rules is important for authorization levels
no_cache deny whitelistsites

The "no_" bit there is a bad typo. Remove it and what this line actually 
does will become clear.

I think usually you want to accept whitelisted things. So that should be 
a blacklist, or the whole line should be erased.
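
For illustration only, assuming the intent really was to keep whitelisted 
sites out of the cache, the same rule under its current directive name 
reads:

  # whitelisted sites are never served from, or stored in, the cache
  cache deny whitelistsites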

http_access allow whitelistsites
http_access allow FullAccessNoLog auth
http_access allow FullAccessLog auth
http_access deny blockedsites
http_access allow RestrictedAccessLog auth

####### logging
# don't log FullAccessNoLog
access_log /var/log/squid3/access.log squid !FullAccessNoLog

####### squid defaults
http_access deny all

#Log Connecting Client DNS Names instead on IP Names.
log_fqdn on

http_port 127.0.0.1:3128
http_port 8080

##Push Traffic Through DansGuardian for Content Filtering
cache_peer 127.0.0.1 parent 8081 0 no-query proxy-only no-delay
no-netdb-exchange no-digest connect-timeout=15 login=*:password

Why not just "login=PASS" (exact text) to relay the user's credentials?

When these altered details loop back to the current Squid it may see 
a user logged in twice with two different passwords from two different 
sources. Could be bad. Particularly since you are basing the diversion 
through DG on the group name instead of on whether the request has 
already been through this Squid once.

I think you want to make this the top of the peer access controls:
 # already passed to DG once. Don't do it twice.
 cache_peer_access 127.0.0.1 deny localhost

##Full Internet Users UnFiltered will only go through Squid & will not
##go through DansGuardian.
cache_peer_access 127.0.0.1 allow RestrictedAccessLog
cache_peer_access 127.0.0.1 deny all
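
Put together, and assuming you keep the single Squid with DG as its 
parent, the peer section might end up looking something like this. It is 
a sketch only, untested against your setup:

  cache_peer 127.0.0.1 parent 8081 0 no-query proxy-only no-delay \
      no-netdb-exchange no-digest connect-timeout=15 login=PASS

  # already passed to DG once. Don't do it twice.
  cache_peer_access 127.0.0.1 deny localhost
  # only the restricted group gets diverted through DG
  cache_peer_access 127.0.0.1 allow RestrictedAccessLog
  cache_peer_access 127.0.0.1 deny all

Order matters here: the "deny localhost" line has to come first so that 
requests coming back from DG are never sent to DG again.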

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.14
  Beta testers wanted for 3.2.0.10