On 10/04/2017 1:36 p.m., daveh wrote:
> Thanks for the reply.
>
> Im parsing squid logs to send to a SIEM to identify IOCs. The SIEM agent
> requires a URL to be formatted with http|https://<URI>
>
> It knows then that it can break the string out into various components such
> as request URL authority, host etc

So it can understand *URL* format. But that is not what is being logged.
Squid technically logs a URI, and this log processing is one of the cases
where the difference between URI and URL matters.

>
> Your comment on logging https connections is not what I have found. I would

I think you misread what I wrote. There are only two ways to get Squid to
know what the https:// URL was - neither of them is normal proxy usage.

> expect that typing https://something.net will return that extact string in
> the log. Every https connection is logged as a CONNECT with the FQDN
> appended the :443.

You expect wrong.

The URL you entered into some client software starts with the scheme
"https://" ... which requires that the fetching of that URL is done
securely. The last thing you should expect is that URL being sent over
plain-text / "in the clear" to some external software.

To do HTTPS the client software has to set up multiple layers of protocols
and security.

1) First it has to open a TCP connection to the proxy.

2) It does then have to tell the proxy where it is going to. But no more
than that. Thus the CONNECT request.

As per <https://tools.ietf.org/html/rfc7230#section-5.3.3> all that any
plain-text connection to a proxy contains is:

  CONNECT www.example.com:443 HTTP/1.1

3) Then it has to set up TLS/SSL encryption over those two TCP connections.
So the crypto happens directly between the client and the server (as if the
proxy were not there).

4) Then, and only then, after all that has been successful does it start to
send the first (or potentially many, hundreds, thousands...) of HTTP
requests over the connection:

  GET /index.html HTTP/1.1
  Host: example.com
  ...

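As a rough illustration of steps 2 and 4 (a Python sketch only, not anything Squid itself runs; the helper name split_for_proxy is made up for this example), this is which parts of a typed https:// URL end up in the plain-text CONNECT and which only ever travel inside the encrypted tunnel:

```python
from urllib.parse import urlsplit


def split_for_proxy(url):
    """Hypothetical helper: return (plain-text CONNECT line, tunnelled request)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 443  # assume the https default when no port is given

    # Step 2: the only thing sent to the proxy in the clear.
    connect_line = f"CONNECT {host}:{port} HTTP/1.1"

    # Step 4: sent only after TLS is up, so the proxy never sees it.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if port == 443 else f"{host}:{port}"
    inner = f"GET {path} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return connect_line, inner


connect_line, inner = split_for_proxy("https://www.example.com/index.html?id=7")
print(connect_line)
print(inner)
```

Neither string contains "https://", and only the first one is ever visible to the proxy. The path and query exist only in the second, which is encrypted end to end.
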
If you look closely at that #4 layer request there is no "https://" there.
Nor any way to reconstruct it. It might even be another CONNECT (thought
TOR invented onion routing? HTTPS beat it by decades).

That meme from The Matrix "there is no spoon" has never been more apt.
There is no "https://" - at least, not once the client interprets its input
URL. It vanishes right there and then.

> Is there something in the config to force this to happen?

There is no simple config option. In fact we go out of our way to ensure
data accuracy. So the log contains reality, and log interpreters can make
whatever assumptions you want them to about what they read there.

PS. I find it particularly odd that you would be trying to feed false
information into a SIEM system - security event detection depends on
accuracy of inputs. But it's your neck.

> DOesnt seem to be a way of doing it with log formatting
>

There is that logformat directive and the codes I gave in my earlier mail.
<http://www.squid-cache.org/Doc/config/logformat/>

and

  "%>rs://%>rd:%>rP%>rp"

If the %>rs is not producing a scheme for CONNECT transactions you could
hard-code "https".

Either way it's a good idea to log these faked-up records to a different log
all of their own. Use the access_log directive to set up multiple outputs:
<http://www.squid-cache.org/Doc/config/access_log/>

Amos