I think I found something that gives us a clue to what is going on.
2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0
The above seems to be applying the regex to "wiki.squid-cache.org:443" instead of to "https://wiki.squid-cache.org/SquidFaq/SquidAcl". I added the regex ".*squid-cache.org.*" to my list of regular expressions and now I see this.
2018/10/15 15:16:03.641 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
2018/10/15 15:16:03.641 kid1| RegexData.cc(93) match: aclRegexData::match: match '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)' found in 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 1
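To double-check the two results above outside of Squid, here is a minimal sketch that feeds the same strings to grep -E. It assumes grep's extended regular expressions behave closely enough to Squid's regex matching for these patterns. The variable names and the check helper are made up for illustration, and the patterns are copied from the "looking for" lines in the log.

```shell
# Strings as they appear in the cache.log excerpts above.
connect_target='wiki.squid-cache.org:443'
full_url='https://wiki.squid-cache.org/SquidFaq/SquidAcl'

# Combined patterns copied from the "looking for" log lines.
strict='(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
loose='(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'

# check PATTERN STRING: hypothetical helper, prints 1 on a match and 0 otherwise,
# mirroring the "result for 'whitelist' is N" log lines.
check() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo 1
    else
        echo 0
    fi
}

check "$strict" "$connect_target"   # prints 0: host:port has no path to match
check "$strict" "$full_url"         # prints 1: the full URL would have matched
check "$loose"  "$connect_target"   # prints 1: squid-cache.org.* matches host:port
```

So the expressions themselves look fine. They only fail because the string being checked is the host and port rather than the full URL.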
Any idea why url_regex doesn't try to match the full URL and instead only matches against the host name (subdomain and domain) and port?
The Squid FAQ says the following:
url_regex: URL regular expression pattern matching
urlpath_regex: URL-path regular expression pattern matching, leaves out the protocol and hostname
with this example given
acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$
This seems to be the case in both 3.3.8 (default on Ubuntu 14.04) and 3.5.12 (default on Ubuntu 16.04).
Is there another configuration that forces url_regex to match the entire URL? Or should I use a different acl type?
Best,
On Mon, Oct 15, 2018 at 11:11 AM RB <ronthecon@xxxxxxxxx> wrote:
Hi Matus,

Thanks for responding so quickly. I uploaded my configurations here if that is more helpful: https://bit.ly/2NF4zNb

The config that I previously shared is called squid_corp.conf. I also noticed that if I don't use regular expressions and instead use domains, it works correctly:

# acl whitelist url_regex "/vagrant/squid_sites.txt"
acl whitelist url_regex .squid-cache.org

Every time my squid.conf or my squid_sites.txt is modified, I restart the squid service

sudo service squid3 restart

Then I use curl to test and now the url works.

$ curl -sSL --proxy localhost:3128 -D - https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
HTTP/1.1 200 Connection established

HTTP/1.1 200 OK
Date: Mon, 15 Oct 2018 14:47:33 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Cookie,User-Agent,Accept-Encoding
Content-Length: 101912
Cache-Control: max-age=3600
Expires: Mon, 15 Oct 2018 15:47:33 GMT
Content-Type: text/html; charset=utf-8

But this does not allow me to get more granular. I can only allow all subdomains and paths for the domain squid-cache.org but I'm unable to only allow the regular expressions if I put them inline or put them in squid_sites.txt.

# acl whitelist url_regex "/vagrant/squid_sites.txt"
acl whitelist url_regex ^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*

If I put them inline like I have above, when I restarted squid it says the following

2018/10/15 14:54:48 kid1| strtokFile: .*squid-cache.org/SquidFaq/SquidAcl.* not found

If I put the expressions in the squid_sites.txt the above "not found" message isn't shown and this is the debug output in /var/log/squid3/cache.log (full output https://pastebin.com/NVwRxVmQ).

2018/10/15 15:05:45.083 kid1| Checklist.cc(275) matchNode: 0x7fb0068da2b8 matched=1 async=0 finished=0
2018/10/15 15:05:45.083 kid1| Acl.cc(336) matches: ACLList::matches: checking whitelist
2018/10/15 15:05:45.083 kid1| Acl.cc(319) checklistMatches: ACL::checklistMatches: checking 'whitelist'
2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0
2018/10/15 15:05:45.084 kid1| Acl.cc(349) matches: whitelist mismatched.
2018/10/15 15:05:45.084 kid1| Acl.cc(354) matches: whitelist result is false

So it's failing the regular expression check. If I use grep to verify if the regex works, it does.

> are you aware that you can only see CONNECT in https requests, unless using
> ssl_bump?

Ah interesting. Are you saying that my https connections will always fail unless I use ssl_bump to decrypt https to http connections? How would this work correctly in production? Does squid proxy only block urls if it detects http? How do you configure ssl_bump to work in this case? and is that viable in production?

> of course it matches all, everything should match "all".
> I more wonder why doesn't it match "http_access allow localhost"

> have you reloaded squid config after changing it?
> Did squid confirm it?

Would you have an example of one entire config file that would work to whitelist an http/https url using a regular expression?

Best,

On Mon, Oct 15, 2018 at 4:49 AM Matus UHLAR - fantomas <uhlar@xxxxxxxxxxx> wrote:

On 15.10.18 01:04, RB wrote:
>I'm trying to deny all urls except for only whitelisted regular
>expressions. I have only this regular expression in my file
>"squid_sites.txt"
>
>^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
are you aware that you can only see CONNECT in https requests, unless using
ssl_bump?
>acl bastion src 10.5.0.0/1
>acl whitelist url_regex "/vagrant/squid_sites.txt"
[...]
>http_access allow manager localhost
>http_access deny manager
>http_access deny !Safe_ports
>http_access allow localhost
>http_access allow purge localhost
>http_access deny purge
>http_access deny CONNECT !SSL_ports
>
>http_access allow bastion whitelist
>http_access deny bastion all
>I tried enabling debugging and tailing /var/log/squid3/cache.log but my
>curl statement keeps matching "all".
of course it matches all, everything should match "all".
I more wonder why doesn't it match "http_access allow localhost"
>$ curl -sSL --proxy localhost:3128 -D - "
>https://wiki.squid-cache.org/SquidFaq/SquidAcl" -o /dev/null 2>&1 | grep
>Squid
>X-Squid-Error: ERR_ACCESS_DENIED 0
>Any ideas what I'm doing wrong?
have you reloaded squid config after changing it?
Did squid confirm it?
--
Matus UHLAR - fantomas, uhlar@xxxxxxxxxxx ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
It's now safe to throw off your computer.
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users