Hi,

We have a fairly simple (in theory) use case: a bunch of headless Chromium browsers connecting to websites on the Internet through various geo-specific proxies. To speed things up, we'd like to add a caching layer, since it's perfectly acceptable for us to honor all max-age/Expires/etc. headers on the accessed content. Nearly all accesses use https, so we've had to implement SSLBump, and we went with Squid 5. That part seems to work well enough.

We initially went with multiple servers configured as cache peers, but since we've been seeing a lot of different problems, we're now focusing on a single Squid 5.0.6 server. It has 128GB of RAM, a 16-core EPYC CPU, 3TB+ of NVMe storage and 1Gbps of Internet bandwidth, which we'd obviously like to use as much as possible.

What we have configured is:

* Multiple http_port directives, each with its own cache_peer pointing at a remote geo-specific proxy. We've had to rebuild with -DMAXTCPLISTENPORTS=512 to raise the default limit of 128. Example (sorry for the line breaks):

  acl port_usa1 localport 21083
  http_port 21083 ssl-bump cert=/etc/squid/ssl_cert/myCA.pem \
      generate-host-certificates=on dynamic_cert_mem_cache_size=32MB
  cache_peer 198.51.100.66 parent 443 3130 no-query no-digest no-delay \
      name=usa1
  cache_peer_access usa1 allow port_usa1

* A simple SSLBump setup, where we don't check origins for cached objects (to avoid the added latency), use a local CA, and have Chromium configured to ignore all SSL/TLS mismatches:

  sslcrtd_program /usr/lib64/squid/security_file_certgen -s \
      /var/cache/squid/ssl_db -M 32MB
  acl step1 at_step SslBump1
  acl step2 at_step SslBump2
  acl step3 at_step SslBump3
  ssl_bump client-first

* A memory and disk cache, to try to use the available resources as much as possible:

  workers 4
  cache_mem 81920 MB
  memory_cache_shared on
  shared_transient_entries_limit 65536
  minimum_object_size 0 KB
  maximum_object_size 20 MB
  maximum_object_size_in_memory 2048 KB
  #cache_dir rock /var/spool/squid 3453640
  max_filedescriptors 16384

We've tried a lot of other configuration options and read a lot of documentation, but we're still getting a lot of errors in the logs. The two most worrying are:

  assertion failed: Transients.cc:221: "old == e"

When that assertion fails, the kid dies and a new one gets forked in its place. We can see this happen multiple times per minute.

  ERROR: Collapsed forwarding queue overflow for kid1 at 1024 items

This one seems impossible for us to track down. It doesn't show up immediately, but it always ends up coming back, sometimes multiple times per second during a high usage peak. We've tried:

* Enabling/disabling "collapsed_forwarding": nothing changes. It should be off by default, yet the message appears regardless.
* Recompiling Squid with the queue limit raised to 4096: same message, just with the new value.
* Disabling the "cache_dir rock" line: the message then seems to take longer to appear, but it ultimately comes back.

Could anyone provide pointers on how to track down what could be causing these two errors? We can provide configuration, logs, traces and dumps as needed.

Cheers,
Matthias

--
Matthias Saou
Web: http://matthias.saou.eu/
Mail/XMPP: matthias@xxxxxxx
GPG: 4096R/E755CC63
     8D91 7E2E F048 9C9C 46AF  21A9 7A51 7B82 E755 CC63
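
P.S. To give a better sense of scale for the first bullet: we end up with several hundred of those per-port stanzas, one per upstream proxy, which is why the MAXTCPLISTENPORTS rebuild was needed. A small generator along these lines is roughly how such stanzas could be produced; the names, local ports and peer addresses below are placeholders, not our real proxies:

  # Sketch only: print one localport ACL + http_port + cache_peer stanza
  # per upstream geo proxy; all names, ports and addresses are placeholders.
  CERT=/etc/squid/ssl_cert/myCA.pem

  emit() {  # usage: emit <name> <local_port> <peer_ip>
      printf 'acl port_%s localport %s\n' "$1" "$2"
      printf 'http_port %s ssl-bump cert=%s \\\n' "$2" "$CERT"
      printf '    generate-host-certificates=on dynamic_cert_mem_cache_size=32MB\n'
      printf 'cache_peer %s parent 443 3130 no-query no-digest no-delay name=%s\n' "$3" "$1"
      printf 'cache_peer_access %s allow port_%s\n\n' "$1" "$1"
  }

  emit usa1 21083 198.51.100.66
  emit usa2 21084 198.51.100.67   # placeholder
  emit deu1 21085 203.0.113.10    # placeholder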
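
P.P.S. If a full backtrace from one of the dying kids would help with the Transients.cc:221 assertion, we can capture one. Something along these lines is what we had in mind; the binary and directory paths are assumptions based on our RPM-based install, so they may need adjusting:

  # Allow the kids to dump core when the assertion aborts them
  # (for a systemd-managed squid, LimitCORE=infinity in the unit instead).
  ulimit -c unlimited

  # In squid.conf, point cores at a directory writable by the squid user:
  #   coredump_dir /var/spool/squid

  # On a systemd host the dump may land in the journal instead:
  coredumpctl list /usr/sbin/squid
  coredumpctl gdb /usr/sbin/squid    # opens the most recent core in gdb

  # Or pull a full backtrace from a core file on disk (core name is a placeholder):
  gdb -batch -ex 'thread apply all bt full' /usr/sbin/squid /var/spool/squid/core.12345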