Re: How squid sends sni to icap server?

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Mon, 7 Aug 2017 22:04:30 +1200

On 07/08/17 08:11, lucas.alvaro@xxxxxxxxxxx wrote:

 >
 >> adaptation_meta X-SNI "%ssl::>sni" all   #or connect
 >> #request_header_add X-SNI "%ssl::>sni" all
 >> "
 >>
 >>
 >> So i want to create an icap service like squidclamav but it must check
 >> SNI not URLs.
 >
 > Any particular reason why?
 > SNI has almost nothing to do with the HTTP messages (plural). It is
 > simply the name of the next-hop server (or proxy) they should be
 > delivered to on their way around the web.
 >
 > I thought squidclamav was an antivirus, not a URL blocklist checker.
 >
You're right: squidclamav is an antivirus, but it offers more services 
than that; it can also check URLs and match them against a blacklist or 
whitelist.
I don't want to decrypt HTTPS traffic, but I want to know where the 
client is trying to connect. I thought SNI was the only way to know the 
server name and the domain without decrypting anything.

Sort of yes, and sort of no. SNI is the name of the server the client 
wants to connect to, but it does not necessarily bear any relation to 
the HTTPS message URL-domain. It could be any of the many names each 
server has pointing to it, e.g. a private/internal hostname or a 
virtual-host domain. The HTTPS message may go to that same name, or to 
any of the server's other names.
 With HTTP/2 becoming more popular, the Alt-Svc / ALTSVC feature is 
getting more traction. In that case the SNI can be expected to contain 
the alternative server's name, while the HTTPS message URL has the exact 
domain/URL wanted from that server.

A slightly more accurate value is the ServerHello cert SubjectAltName 
field which lists the names the server is publicly advertising itself to 
be. That is also available without decrypting using a peek/stare at 
step2 of SSL-Bump.
 That field is more accurate because it can, and often does, contain a 
whole list of the server's various names, including wildcard 
sub-domains, which reflects the reality of what a "site" actually looks 
like. Despite many of us humans thinking a site/domain is a singular 
thing, it is actually a messy collection of pieces.

These details are all part of why ssl::server_name exists separate from 
the more familiar dstdomain ACL type.

IMHO, if you want your service to cope well with virtual hosting etc., 
sending it both the SNI and the full SubjectAltName set of values would 
be best. Then it can decide whether any of those details needs blocking 
or is safe to allow.
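
As a rough illustration of that decision (a sketch only, not c-icap or 
squidclamav code): the Python below assumes the service has already 
obtained the SNI and the SubjectAltName list by some means, e.g. the 
X-SNI header above for the former. The function name is_blocked and the 
blocklist format are made up for the example.

```python
from typing import Iterable, Optional


def is_blocked(sni: Optional[str],
               san_names: Iterable[str],
               blocklist: Iterable[str]) -> bool:
    """Return True if the SNI or any SubjectAltName falls under a blocked domain.

    Hypothetical helper: blocklist entries are plain domains such as
    "google.com" and cover the domain itself plus all of its sub-domains.
    """
    domains = [d.lower().strip(".") for d in blocklist]
    candidates = [name for name in [sni, *san_names] if name]
    for name in candidates:
        host = name.lower().rstrip(".")
        # A wildcard SAN such as "*.google.com" is treated as "google.com".
        if host.startswith("*."):
            host = host[2:]
        for domain in domains:
            if host == domain or host.endswith("." + domain):
                return True
    return False
```

Note that blocking on any SubjectAltName match can over-block when 
unrelated sites share one certificate, so a real service may want to 
weigh the SNI and the SubjectAltName list differently.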

The final goal is to blacklist, for example, google, so that when the 
SNI indicates www.google.com, c-icap denies access.

 >
 >> I peek at all the steps to get the SNI, and in the squid access log
 >> the SNI is printed.
 >>
 >> I read that adaptation_meta can send anything from squid to icap, but
 >> clearly I am using it incorrectly: I can't see the SNI in the icap
 >> access log or in the icap headers.
 >
 > Your usage appears to be correct. I think there is no SNI being received
 > by Squid.

That's problematic because in my squid access log there are 
"www.youtube.com" "www.google.com", which is exactly what I'm trying to 
pass to c-icap. It seems like squid receives the SNI.

FWIW; Squid gets the values from:
 a) CONNECT tunnel request-target, or
 b) SNI, or
 c) server cert SubjectAltName, or
 d) decrypted HTTPS message URL, or
 e) reverse-DNS of the TCP dst-IP address.
In that order AFAIK. So if any of the non-SNI details becomes available 
Squid can log a name.
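
To make the consequence concrete, here is a small Python model of that 
fallback order. It is only an illustration of the list above, not 
Squid's actual code; the function and label names are invented.

```python
from typing import Optional, Tuple

# Sources in the order listed above (as far as known); first match wins.
SOURCE_ORDER = (
    "connect_target",   # a) CONNECT tunnel request-target
    "sni",              # b) SNI
    "cert_san",         # c) server cert SubjectAltName
    "https_url_host",   # d) decrypted HTTPS message URL
    "reverse_dns",      # e) reverse-DNS of the TCP dst-IP address
)


def pick_logged_name(**available: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (source, name) for the first source that has a value."""
    for source in SOURCE_ORDER:
        value = available.get(source)
        if value:
            return source, value
    return None
```

For instance, pick_logged_name(cert_san="www.google.com") still yields a 
name although no SNI was supplied, which is why a name in the access log 
does not by itself prove that SNI arrived.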

That said, I do think you may be hitting a bug in Squid's SNI handling; 
it's not perfect yet, particularly in the Squid-3.x code. So a traffic 
analysis with Wireshark or similar would be useful at this point to 
check and confirm whether SNI is being sent or one of those other 
sources is supplying the name.
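
If you prefer to check a capture programmatically, the following Python 
sketch pulls the SNI out of a single raw TLS ClientHello record (the 
bytes of the first client-to-server TLS record, however you extract them 
from the capture). It assumes the whole ClientHello fits in one 
unfragmented record and is not a substitute for Wireshark; extract_sni 
is a made-up helper name.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from one TLS ClientHello record, or None."""
    # TLS record header: content type 0x16 (handshake), version, length.
    if len(record) < 5 or record[0] != 0x16:
        return None
    record_len = struct.unpack("!H", record[3:5])[0]
    body = record[5:5 + record_len]
    # Handshake header: type 0x01 (ClientHello) plus a 3-byte length.
    if len(body) < 4 or body[0] != 0x01:
        return None
    try:
        pos = 4 + 2 + 32                      # skip client_version and random
        pos += 1 + body[pos]                  # skip session_id
        pos += 2 + struct.unpack("!H", body[pos:pos + 2])[0]  # cipher_suites
        pos += 1 + body[pos]                  # skip compression_methods
        if pos + 2 > len(body):
            return None                       # ClientHello has no extensions
        ext_end = pos + 2 + struct.unpack("!H", body[pos:pos + 2])[0]
        pos += 2
        while pos + 4 <= min(ext_end, len(body)):
            ext_type, ext_len = struct.unpack("!HH", body[pos:pos + 4])
            pos += 4
            if ext_type == 0:                 # server_name extension
                data = body[pos:pos + ext_len]
                # Layout: list length (2), name_type (1), name length (2), name.
                if len(data) < 5 or data[2] != 0:
                    return None
                name_len = struct.unpack("!H", data[3:5])[0]
                return data[5:5 + name_len].decode("ascii", "replace")
            pos += ext_len
    except (IndexError, struct.error):
        return None                           # truncated or malformed record
    return None
```

A result of None for the client's first record would support the theory 
that no SNI is being sent and that the logged name comes from one of the 
other sources.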

Amos
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users