In addition to what Matus and Alex have already said about your problem,
you do not appear to understand regex patterns properly.

On 16/10/18 4:11 AM, RB wrote:
> Hi Matus,
>
> Thanks for responding so quickly. I uploaded my configurations here if
> that is more helpful: https://bit.ly/2NF4zNb
>
> The config that I previously shared is called squid_corp.conf. I also
> noticed that if I don't use regular expressions and instead use domains,
> it works correctly:
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex .squid-cache.org

This is still a regex. The ACL type is "url_regex", which makes the
string a regex - no matter what it looks like to your human eyes. To
Squid it is a regex.

It will match things like http://example.com/sZsquid-cacheXORG just as
easily as any sub-domain of squid-cache.org - for example, any traffic
injecting the squid-cache.org domain into its path or query-string.

> Every time my squid.conf or my squid_sites.txt is modified, I restart
> the squid service
>
> sudo service squid3 restart

If Squid does not accept the config file it will not necessarily
restart. You should always run "squid -k parse" or "squid3 -k parse" to
check the config before attempting a restart.

The old Debian sysV init scripts had some checks that would protect you
from such problems, but the newer systemd "service" tooling is not able
to do that in a nice way. The habit is a good one to get into anyway.

> Then I use curl to test and now the url works.
>
> $ curl -sSL --proxy localhost:3128 -D -
> https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
> HTTP/1.1 200 Connection established
>
> HTTP/1.1 200 OK
> Date: Mon, 15 Oct 2018 14:47:33 GMT
> Server: Apache/2.4.7 (Ubuntu)
> Vary: Cookie,User-Agent,Accept-Encoding
> Content-Length: 101912
> Cache-Control: max-age=3600
> Expires: Mon, 15 Oct 2018 15:47:33 GMT
> Content-Type: text/html; charset=utf-8
>
> But this does not allow me to get more granular.
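To see why ".squid-cache.org" is dangerous as a url_regex value, here is a
small sketch using Python's re module (Squid actually uses POSIX regexes,
but the unanchored anywhere-match behaviour is the same; the IGNORECASE
flag stands in for Squid's optional "-i" ACL flag):

```python
import re

# The ACL value ".squid-cache.org" is a regex: each "." matches ANY
# character, and an unanchored pattern matches anywhere in the URL.
pattern = re.compile(r".squid-cache.org", re.IGNORECASE)

# It matches the sub-domains you intended...
assert pattern.search("http://wiki.squid-cache.org/SquidFaq")

# ...but also a URL that merely embeds a similar byte sequence in its
# path, because "." matches "Z", "X", or "/" just as well as a literal dot.
assert pattern.search("http://example.com/sZsquid-cacheXORG")
```

This is why an attacker can smuggle the allowed domain into a path or
query-string and still pass the ACL.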
> I can only allow all subdomains and paths for the domain
> squid-cache.org <http://squid-cache.org> but I'm unable to only allow
> the regular expressions if I put them inline or put them in
> squid_sites.txt.
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex ^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
> acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*

Any regex pattern that lacks the beginning (^) and ending ($) anchor
symbols is always a match against *anywhere* in the input string. So
starting it with an optional prefix (.* or .?) or ending it with an
optional suffix (.* or .?) is pointless and confusing.

Notice how the pattern Squid is actually using lacks these prefix/suffix
parts of your patterns:

> aclRegexData::match: looking for
> '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'

>> are you aware that you can only see CONNECT in https requests, unless
>> using ssl_bump?
>
> Ah interesting. Are you saying that my https connections will always
> fail

They will always fail to match your current regex patterns, because
those patterns contain characters which only ever exist in the path
portion of a URL (note the *L*) - never in a CONNECT message URI (note
the *I*), which never contains any path portion.

> unless I use ssl_bump to decrypt https to http connections? How
> would this work correctly in production? Does squid proxy only block
> urls if it detects http? How do you configure ssl_bump to work in this
> case? and is that viable in production?

SSL-Bump takes the CONNECT tunnel data/payload portion and *attempts* to
decrypt any TLS inside. *If* the tunnel contains HTTPS traffic (not
guaranteed), that is where the full https:// ... URLs are found.

Matus and Alex have already mentioned the issues with that, so I won't
cover them again.
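The CONNECT point can be sketched the same way: without SSL-Bump, the only
URI Squid sees for an HTTPS request is the CONNECT authority form
(host:port), so any pattern that requires a "/path" can never match it.
Again this uses Python's re module for illustration, and the host:port
value and the alternative pattern below are assumptions for the example:

```python
import re

# The pattern from the config above: it requires a scheme and a path.
path_pattern = re.compile(r"^https://wiki\.squid-cache\.org/SquidFaq/SquidAcl")

# What Squid actually sees for an HTTPS request without SSL-Bump:
# the CONNECT authority form - host:port only, no scheme, no path.
connect_uri = "wiki.squid-cache.org:443"
assert path_pattern.search(connect_uri) is None  # never matches

# A pattern intended to match a CONNECT URI must target host:port instead.
connect_pattern = re.compile(r"^wiki\.squid-cache\.org:443$")
assert connect_pattern.search(connect_uri)
```

This is why path-level filtering of HTTPS traffic requires decrypting the
tunnel: the path simply is not visible to the proxy otherwise.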
Amos

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users