
Re: How to create a simple whitelist using regexes?

I think I found something that gives us a clue to what is going on.

2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0

The above seems to be applying the regex to "wiki.squid-cache.org:443" instead of to "https://wiki.squid-cache.org/SquidFaq/SquidAcl". I added the regex ".*squid-cache.org.*" to my list of regular expressions and now I see this.

2018/10/15 15:16:03.641 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
2018/10/15 15:16:03.641 kid1| RegexData.cc(93) match: aclRegexData::match: match '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)' found in 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 1
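
The two log excerpts above can be reproduced outside Squid. The following is only a minimal sketch: it uses Python's `re` module as a stand-in for the regex library Squid uses (an assumption, although these simple patterns should behave the same way in both), and the helper name `matches` is hypothetical. The patterns and the string 'wiki.squid-cache.org:443' are copied from the log lines.

```python
import re

# Combined patterns copied verbatim from the "looking for" log lines.
FIRST = (r"(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)"
         r"|(squid-cache.org/SquidFaq/SquidAcl.*)")
SECOND = (r"(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)"
          r"|(squid-cache.org.*)")

# What the log says Squid is checking, and what I expected it to check.
CONNECT_TARGET = "wiki.squid-cache.org:443"
FULL_URL = "https://wiki.squid-cache.org/SquidFaq/SquidAcl"


def matches(pattern, text):
    """Hypothetical helper: unanchored search, like 'found in' in the log."""
    return re.search(pattern, text) is not None


for name, pattern in (("first", FIRST), ("second", SECOND)):
    for label, text in (("host:port", CONNECT_TARGET), ("full url", FULL_URL)):
        print(name, label, matches(pattern, text))
```

The first pattern only matches the full URL, while the second matches 'wiki.squid-cache.org:443' through its 'squid-cache.org.*' alternative, which is consistent with the results 0 and 1 in the logs.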

Any idea why url_regex wouldn't try to match the full url and instead only matches on the subdomain, host domain, and port? 

The Squid FAQ says the following:

url_regex: URL regular expression pattern matching
urlpath_regex: URL-path regular expression pattern matching, leaves out the protocol and hostname

with this example given

acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$

This seems to be the case between 3.3.8 (default on ubuntu 14.04) and 3.5.12 (default on ubuntu 16.04).

Is there another configuration that forces url_regex to match the entire url? or should I use a different acl type?

Best,

On Mon, Oct 15, 2018 at 11:11 AM RB <ronthecon@xxxxxxxxx> wrote:
Hi Matus,

Thanks for responding so quickly. I uploaded my configurations here if that is more helpful: https://bit.ly/2NF4zNb

The config that I previously shared is called squid_corp.conf. I also noticed that if I don't use regular expressions and instead use domains, it works correctly:

# acl whitelist url_regex "/vagrant/squid_sites.txt"
acl whitelist url_regex .squid-cache.org

Every time my squid.conf or my squid_sites.txt is modified, I restart the squid service:

sudo service squid3 restart

Then I use curl to test and now the url works. 

$ curl -sSL --proxy localhost:3128 -D - https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
HTTP/1.1 200 Connection established

HTTP/1.1 200 OK
Date: Mon, 15 Oct 2018 14:47:33 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Cookie,User-Agent,Accept-Encoding
Content-Length: 101912
Cache-Control: max-age=3600
Expires: Mon, 15 Oct 2018 15:47:33 GMT
Content-Type: text/html; charset=utf-8

But this does not let me get more granular. I can only allow all subdomains and paths for the domain squid-cache.org; I'm unable to allow only the regular expressions, whether I put them inline or in squid_sites.txt.

# acl whitelist url_regex "/vagrant/squid_sites.txt"
acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*

If I put them inline as above, Squid says the following when I restart it:

2018/10/15 14:54:48 kid1| strtokFile: .*squid-cache.org/SquidFaq/SquidAcl.* not found

If I put the expressions in the squid_sites.txt the above "not found" message isn't shown and this is the debug output in /var/log/squid3/cache.log (full output https://pastebin.com/NVwRxVmQ).

2018/10/15 15:05:45.083 kid1| Checklist.cc(275) matchNode: 0x7fb0068da2b8 matched=1 async=0 finished=0
2018/10/15 15:05:45.083 kid1| Acl.cc(336) matches: ACLList::matches: checking whitelist
2018/10/15 15:05:45.083 kid1| Acl.cc(319) checklistMatches: ACL::checklistMatches: checking 'whitelist'
2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0
2018/10/15 15:05:45.084 kid1| Acl.cc(349) matches: whitelist mismatched.
2018/10/15 15:05:45.084 kid1| Acl.cc(354) matches: whitelist result is false

So it's failing the regular expression check. If I use grep to check the regex, it matches.


> are you aware that you can only see CONNECT in https requests, unless using
> ssl_bump?

Ah, interesting. Are you saying that my https connections will always fail unless I use ssl_bump to decrypt https to http connections? How would this work correctly in production? Does squid only block urls if it detects http? How do you configure ssl_bump to work in this case, and is that viable in production?

> of course it matches all, everything should match "all".
> I wonder more why it doesn't match "http_access allow localhost"

> have you reloaded squid config after changing it?
> Did squid confirm it?

Would you have an example of one entire config file that would work to whitelist an http/https url using a regular expression?

Best,


On Mon, Oct 15, 2018 at 4:49 AM Matus UHLAR - fantomas <uhlar@xxxxxxxxxxx> wrote:
On 15.10.18 01:04, RB wrote:
>I'm trying to deny all urls except for only whitelisted regular
>expressions. I have only this regular expression in my file
>"squid_sites.txt"
>
>^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*

are you aware that you can only see CONNECT in https requests, unless using
ssl_bump?
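
To illustrate the point about CONNECT, here is a minimal Python sketch of the request line a client sends to a forward proxy. It is not Squid code, and the helper name `proxy_request_line` is hypothetical. For an https URL the client only asks the proxy to open a tunnel to host:port, so the path never reaches the proxy in clear text; for a plain http URL the full URL is sent.

```python
from urllib.parse import urlsplit


def proxy_request_line(url):
    """Hypothetical helper: first line a client sends to a forward proxy."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # HTTPS through a proxy starts with a CONNECT tunnel request.
        # Only the host and port are visible; the path stays encrypted.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    # Plain HTTP through a proxy uses the absolute URL in the request line.
    return f"GET {url} HTTP/1.1"


print(proxy_request_line("https://wiki.squid-cache.org/SquidFaq/SquidAcl"))
print(proxy_request_line("http://www.squid-cache.org/Doc/FAQ/"))
```

The first call yields a CONNECT line for wiki.squid-cache.org:443, which is the same string that shows up as 'checking' in the cache.log excerpts quoted earlier in this thread.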


>acl bastion src 10.5.0.0/1
>acl whitelist url_regex "/vagrant/squid_sites.txt"
[...]
>http_access allow manager localhost
>http_access deny manager
>http_access deny !Safe_ports
>http_access allow localhost
>http_access allow purge localhost
>http_access deny purge
>http_access deny CONNECT !SSL_ports
>
>http_access allow bastion whitelist
>http_access deny bastion all

>I tried enabling debugging and tailing /var/log/squid3/cache.log but my
>curl statement keeps matching "all".

of course it matches all, everything should match "all".

I wonder more why it doesn't match "http_access allow localhost"

>$ curl -sSL --proxy localhost:3128 -D - "https://wiki.squid-cache.org/SquidFaq/SquidAcl" -o /dev/null 2>&1 | grep Squid
>X-Squid-Error: ERR_ACCESS_DENIED 0

>Any ideas what I'm doing wrong?

have you reloaded squid config after changing it?
Did squid confirm it?

--
Matus UHLAR - fantomas, uhlar@xxxxxxxxxxx ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
It's now safe to throw off your computer.
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users