-------- Original Message --------
On June 14, 2018 1:25 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:

> On 14/06/18 07:28, baretomas wrote:
> >
> > Hello,
> >
> > I'm setting up a Squid proxy as a cache for a number (as many as
> > possible) of identical Java applications to run their web calls
> > through. The calls are of course identical, and the responses they
> > get can safely be cached for 5-10 seconds.
> >
> > I do this because most of the calls are directed at a single server
> > on the internet that I don't want to hammer, since I would of course
> > be locked out of it then.
> >
> > Currently I'm simply testing this on a single computer: the
> > application and Squid.
> >
> > The calls from the application are made over SSL/HTTPS by telling
> > Java to use Squid as a proxy (-Dhttps.proxyHost and
> > -Dhttp.proxyHost). I've set up Squid and Java with self-signed
> > certificates, and the application sends its calls through Squid and
> > gets the response. No problem there (wasn't easy either, I must
> > say :P ).
>
> I was going to ask what was so hard about it. Then I looked at your
> config and saw that you are in fact using NAT interception instead of
> the easy way.
>
> So what exactly do those -D options cause the Java applications to do
> with the proxy? I have some suspicions, but am not familiar enough
> with the Java API, and the specific details are critical to what you
> need the proxy to be doing.
>
> > The problem is that none of the calls get cached: all rows in the
> > access.log have a TCP_MISS/200 tag in them.
> >
> > I've searched all through the web for a solution to this, and have
> > tried everything people have suggested. So I was hoping someone
> > could help me? Anyone have any tips on what to try?
>
> There are three ways to do this:
>
> 1. If you own the domain the apps are connecting to: set up the proxy
> as a normal TLS / HTTPS reverse-proxy.
>
> 2. If you have enough control of the apps to get them connecting with
> TLS to the proxy and sending their requests there: do that.
>
> 3. The (relatively) complicated SSL-Bump way you found. The proxy is
> fully at the mercy of the messages sent by apps and servers. Caching
> is a luxury here, easily broken / prevented.
>
> Well, there is a fourth way with intercept. But that is a VERY last
> resort, and you already have (3) going, which is already better than
> intercept. Getting to (1) or (2) would be simplest if you meet the
> "if ..." requirements for those.
>
> > My config (note I've set the refresh_pattern like that just to see
> > if I could catch anything. The plan is to modify it so it actually
> > does refresh the responses from the web calls in 5-10 second
> > intervals. There are commented-out parts I've tried with no luck
> > there too):
> ...
>
> Ah. The way you write that implies a misunderstanding about
> refresh_pattern.
>
> HTTP has some fixed algorithms written into the protocol that caches
> are required to perform to determine whether any stored object can be
> used or requires replacement.
>
> The parameters used by these algorithms come in the form of headers
> in the originally stored reply message and the current client's
> request. Sometimes they require revalidation, which is a quick check
> with the server for updated instructions and/or content.
>
> What refresh_pattern actually does is provide default values for
> those algorithm parameters IF any one (or more) of them are missing
> from those HTTP messages.
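>
> For illustration only - a hypothetical rule with made-up values, and
> note that the min and max fields of refresh_pattern are in minutes,
> so it cannot express a 10-second lifetime:
>
>   # Supplies heuristic defaults ONLY for responses that arrive with
>   # no freshness information; it does not override headers the
>   # server does send.
>   refresh_pattern . 0 20% 1
>
> With that rule, a response carrying no Expires or Cache-Control
> header would be treated as fresh for up to 20% of the time since its
> Last-Modified date, capped at 1 minute.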
>
> The proper way to make caching happen with your desired behaviour is
> for the server to present an HTTP Cache-Control header saying the
> object is cacheable (i.e. does not forbid caching), but not for more
> than 10 seconds:
>
>   Cache-Control: max-age=10
>
> OR to say that objects need revalidation, but present a 304 status
> for revalidation checks (i.e. Cache-Control: no-cache). (Yeah, that's
> right, "no-cache" means do cache.)
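>
> For illustration only (a hypothetical origin-server reply; the header
> values are invented, not taken from your traffic), a response Squid
> could cache for up to 10 seconds would look something like:
>
>   HTTP/1.1 200 OK
>   Date: Thu, 14 Jun 2018 01:25:00 GMT
>   Cache-Control: max-age=10
>   Content-Type: application/json
>   Content-Length: 123
>
> While such an entry is fresh, identical requests can be answered
> straight from cache (TCP_HIT / TCP_MEM_HIT in access.log instead of
> TCP_MISS); after 10 seconds it goes stale and must be refetched or
> revalidated.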
>
> That said, I doubt you really want to force that, and you would
> probably be happy if the server instructed the proxy that an object
> is safe to cache for several minutes, or any value larger than 10
> seconds.
>
> So what we circle back to is that you are probably trying to force
> things to cache and be used long past their actual safe-to-use
> lifetimes as specified by the devs most authoritative on that subject
> (under 10 sec?). As you should be aware, this is a highly unsafe
> thing to be doing unless you are one of those devs - be very careful
> what you choose to do.
>
> > # Squid normally listens to port 3128
> > #http_port 3128 ssl-bump generate-host-certificates=on
> >   dynamic_cert_mem_cache_size=4MB
> >   cert=/cygdrive/c/squid/etc/squid/correct.pem
> >   key=/cygdrive/c/squid/etc/squid/ssl/myca.key
> >
> > http_port 3128 ssl-bump generate-host-certificates=on
> >   dynamic_cert_mem_cache_size=4MB
> >   cert=/cygdrive/c/squid/etc/squid/proxyCAx.pem
> >   key=/cygdrive/c/squid/etc/squid/proxyCA.pem
> >
> > #https_port 3129 cert=/cygdrive/c/squid/etc/squid/proxyCAx.pem
> >   key=/cygdrive/c/squid/etc/squid/proxyCA.pem
>
> Hmm. This is a Windows machine running Cygwin?
>
> FYI: performance is going to be terrible. It may not be super
> relevant yet. Just be aware that Windows imposes limitations on
> usable sockets per application - much smaller than a typical proxy
> requires. The Cygwin people do a lot, but they cannot solve some OS
> limitation problems.
>
> To meet your very first sentence's "as many as possible" requirement
> you will need a non-Windows machine to run the proxy on. That simple
> change will get you something around 3 orders of magnitude higher
> peak client capacity on the proxy.
>
> > # Uncomment the line below to enable disk caching - path format is
> > # /cygdrive/<full path to cache folder>, i.e.
> > #cache_dir aufs /cygdrive/c/squid/var/cache/ 3000 16 256
> >
> > # certificate generation program
> > sslcrtd_program /cygdrive/c/squid/lib/squid/ssl_crtd -s
> >   /cygdrive/c/squid/var/cache/squid_ssldb -M 4MB
> >
> > # Leave coredumps in the first cache dir
> > coredump_dir /var/cache/squid
> >
> > # Add any of your own refresh_pattern entries above these.
> > #refresh_pattern ^ftp: 1440 20% 10080
> > #refresh_pattern ^gopher: 1440 0% 1440
> > #refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
> > #refresh_pattern -i (/cgi-bin/|\?) 1440 100% 4320 ignore-no-store
> >   override-lastmod override-expire ignore-must-revalidate
> >   ignore-reload ignore-private ignore-auth
> >
> > refresh_pattern . 1440 100% 4320 ignore-no-store override-lastmod
> >   override-expire ignore-must-revalidate ignore-reload
> >   ignore-private ignore-auth
>
> - ignore-must-revalidate actively reduces caching, because it
> disables several of the widely used HTTP mechanisms that rely on
> revalidation to allow things to be stored in a cache. It is only
> beneficial if the server is broken: requiring revalidation while not
> supporting revalidation.
>
> - ignore-auth has the same un-intuitive effect as ignoring
> revalidation, again reducing caching ability. It is only useful if
> you want to prevent caching of content that requires any form of
> login to view. High-security networks dealing with classified or
> confidential materials find this useful - regular Internet admins,
> not so much.
>
> - ignore-no-store is highly dangerous and rarely necessary: the
> "nuclear option" for caching. It has the potential to eradicate user
> privacy and scramble up any server-personalized content (not in a
> good way). This is a last resort intended only to cope with severely
> braindead applications. YMMV whether you have to deal with any of
> those - just treat this as an absolute last resort rather than
> something to play with.
>
> Overall - in order to use these refresh_pattern controls you need to
> know what the HTTP(S) messages going through your proxy contain in
> terms of caching headers AND what those messages are doing
> semantically / content-wise for the client application. Using any of
> them as a generic "makes caching better" thing only leads to problems
> in today's HTTP protocol.
>
> > # Bumped requests have relative URLs so Squid has to use reverse
> > # proxy or accelerator code. By default, that code denies direct
> > # forwarding. The need for this option may disappear in the future.
> > #always_direct allow all
> >
> > dns_nameservers 8.8.8.8 208.67.222.222
>
> Use of 8.8.8.8 is known to be explicitly detrimental to caching
> intercepted traffic.
>
> Those servers present different result sets based on the timing and
> IP sending the query. The #1 requirement for caching intercepted (or
> SSL-Bump'ed) content is that the client and proxy have the exact same
> view of the DNS system's contents. Having the DNS reply contents
> change between two consecutive and identical queries breaks that
> requirement.
>
> > max_filedescriptors 3200
> >
> > # Max Object Size Cache
> > maximum_object_size 10240 KB
> >
> > acl step1 at_step SslBump1
> > ssl_bump peek step1
> > ssl_bump bump all
>
> This causes the proxy to attempt decryption of the traffic using
> crypto algorithms based solely on the ClientHello details and its own
> capabilities. There are zero server crypto capabilities known for the
> proxy to use to ensure traffic can actually make it to the server.
>
> You are rather lucky that it worked at all. Almost any deviation
> (i.e. emergency security updates in future) at either the client,
> server or proxy endpoint risks breaking communication through this
> proxy.
>
> Ideally there would be a stare action for step2, and then bump only
> at step3.
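>
> For illustration only, a sketch of that idea (the acl names are
> arbitrary):
>
>   # Peek at the TLS ClientHello in step 1, stare at the server's
>   # certificate in step 2, and only bump at step 3 - so the proxy
>   # knows both endpoints' crypto capabilities before committing to
>   # decryption.
>   acl step1 at_step SslBump1
>   acl step2 at_step SslBump2
>   ssl_bump peek step1
>   ssl_bump stare step2
>   ssl_bump bump all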
>
> So, in summary, the things to try to get better caching:
>
> - Ditch 8.8.8.8. Use a local DNS resolver within your own network,
> shared by clients and proxy. That resolver can use 8.8.8.8 itself;
> the important part is that it should be responsible for caching DNS
> results and ensuring the app clients and Squid see as much the same
> records as possible.
>
> - Try "debug_options 11,2" to get a cache.log of the HTTP(S) headers
> for messages being decrypted in the proxy. Look at those headers to
> see why they are not caching normally, and use that info to inform
> your next actions. It cannot tell you how a message is used by the
> application; hopefully you can figure that out somehow before forcing
> anything unnatural.
>
> - If you can, try pasting some of the transaction URLs into the tool
> at redbot.org to see if there are any HTTP-level mistakes in the apps
> that could be fixed for better cacheability.
>
> Amos

Many thanks for this very informative reply to my question! I will
spend some time understanding it, and try out the things you suggest!

Thanks again!

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users