
Re: url_rewrite_program and ACLs

On 10/11/17 00:39, Vieri wrote:

________________________________
From: Amos Jeffries <squid3@xxxxxxxxxxxxx>

Darn. You have the one case that calls for keeping the helper :-(

You can still move the ACLs that load in a reasonable time into
squid.conf and leave the others in SG/ufdbguard, using
url_rewrite_access to restrict which transactions the helper gets
involved with. That will reduce its latency impact on live traffic, but
still cause much the same memory-related (non-)issues as now.
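
For illustration, a minimal squid.conf sketch of that split (the ACL
names and file paths here are hypothetical):

  # fast-loading lists handled natively by Squid
  acl local_whitelist dstdomain "/etc/squid/local_whitelist.txt"
  acl local_blocklist dstdomain "/etc/squid/local_blocklist.txt"
  http_access deny local_blocklist

  # keep the helper out of transactions already decided above;
  # only the remaining traffic gets passed to SG/ufdbguard
  url_rewrite_access deny local_whitelist
  url_rewrite_access allow all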


That's exactly what I'm doing right now...
Thanks.

Running "squid -k shutdown" a _second_ time sends the running proxy a
signal to immediately skip to the processing as if the shutdown_lifetime
had already been reached.
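
In other words, from the command line:

  squid -k shutdown   # begins a graceful shutdown, waiting shutdown_lifetime
  squid -k shutdown   # run a second time: skip the wait, shut down now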


Thanks for that double-shutdown signal trick. I'll have to try that asap.

I'm making progress (sort of) on the FD (non-)issues I'm having.

I'll try to post back to Alex asap.

I have a custom Perl script that does MySQL lookups for blacklisted sites (lots of them, so I can't use ACLs within squid.conf). I define that helper with external_acl_type.
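
Roughly, the relevant squid.conf part looks like this (helper name,
path and options here are generic placeholders, not my real config):

  external_acl_type sql_blacklist ttl=60 children-max=10 %URI \
      /usr/local/bin/sql_blacklist_check.pl
  acl banned_sites external sql_blacklist
  http_access deny banned_sites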

Yesterday I changed my squid.conf by disabling this helper, and used squidGuard instead.
I noticed a huge improvement.

Hmm, that is suspicious. AFAIK SquidGuard's one remaining useful feature is how it loads large lists into memory in big chunks, then processes them after it is technically already handling traffic from Squid - whereas Squid loads the files line by line, which is slower initially. Once loaded there is no difference in the lookup algorithms, and SQL DB storage should be no different from how SG does it.

I would compare your custom script to the ext_sql_session_acl.pl.in script we bundle with current Squid. If yours lacks concurrency channel-ID support, I highly recommend adding that behaviour. If the DB is designed to store the protocol scheme, domain[:port], and path?query portions of URLs in separate columns, it will be more efficient to pass those as separate parameters (%PROTO %DST %PORT %PATH) to the helper instead of just %URI.
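
Something along these lines (a rough sketch with a hypothetical DB
schema and credentials - not the bundled script):

  #!/usr/bin/perl
  # Sketch of a concurrent external_acl_type helper. Assumes a
  # squid.conf definition along the lines of:
  #   external_acl_type sqlbl concurrency=50 %PROTO %DST %PORT %PATH \
  #       /usr/local/bin/sql_blacklist_check.pl
  use strict;
  use warnings;
  use DBI;

  $| = 1;  # unbuffered stdout so Squid sees each reply immediately

  my $dbh = DBI->connect('DBI:mysql:database=blacklist;host=localhost',
                         'squid', 'secret', { RaiseError => 1 });
  my $sth = $dbh->prepare(
      'SELECT 1 FROM banned_sites WHERE domain = ? LIMIT 1');

  while (my $line = <STDIN>) {
      chomp $line;
      # with concurrency=N the first token is a channel-ID, which must
      # be echoed back so Squid can match the reply to the request
      my ($channel, $proto, $dst, $port, $path) = split ' ', $line;
      $sth->execute($dst);
      my ($banned) = $sth->fetchrow_array;
      print $banned ? "$channel ERR\n" : "$channel OK\n";
  }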

The overheads in Squid of using the external_acl_type helper interface should be slightly less than those of the url_rewrite_program interface used for SG. The SQL DB data loading is about the same as, or better than, what SG does AFAIK.



I took this snapshot yesterday:

15:25 08/11/2017:

File descriptor usage for squid:
Maximum number of file descriptors: 65536
Largest file desc currently in use: 2730
Number of file desc currently in use: 1838
Files queued for open: 0
Available number of file descriptors: 63698
Reserved number of file descriptors: 100
Store Disk files open: 0

Today I took another peek and found:

Thu Nov 9 12:19:05 CET 2017:

File descriptor usage for squid:
Maximum number of file descriptors: 65536
Largest file desc currently in use: 6980
Number of file desc currently in use: 6627
Files queued for open: 0
Available number of file descriptors: 58909
Reserved number of file descriptors: 100
Store Disk files open: 0

The FDs are still increasing steadily, but a LOT less.

On the other hand, the "free" RAM went from 2GB yesterday to just 275MB today:


Ouch, but kind of expected with that increase in FD numbers. Each client connection will use about 3 FDs; two of them will use ~70KB each, and the third will use some multiple of your average object size.
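
Rough back-of-envelope with your numbers: 6627 FDs at ~3 per client is roughly 2200 concurrent connections, and at ~140KB of fixed buffers each that is already around 300MB before counting the object-sized buffers.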

Which reminds me ... Are you using SSL-Bump? If so, ensure that you have configured "sslflags=NO_DEFAULT_CA" on the port lines. The default Trusted CA set can add a huge amount of useless memory to each client connection, which can add up to many GB quite quickly.
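
For example, on an ssl-bump port (the port number and cert path here
are hypothetical; the relevant part is the sslflags option):

  https_port 3130 intercept ssl-bump generate-host-certificates=on \
      cert=/etc/squid/ssl_cert/myCA.pem sslflags=NO_DEFAULT_CA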


# free --mega
              total        used        free      shared  buff/cache   available
Mem:          32865        8685         275         157       23904       23683
Swap:         37036         286       36750

Used swap is still low enough (unchanged actually), so I guess I don't need to worry about it.

Nod - until the RAM runs out entirely. Then problems are definitely to be expected, and that sounds like it is your problem now.

FYI: The first side-effect of RAM swapping is that Squid starts slowing down on completing transactions - memory cache hits slow by 3-6 orders of magnitude when swapping in/out from disk, and any I/O buffers swapped out incur an extra speed penalty for the swap I/O time. That all leads to more active client connections overlapping their active periods (thus more FD usage in total), and clients also start opening more connections to get better service from parallel fetches. So FD usage grows from two directions simultaneously. Which is all in a feedback loop, since the extra memory pressure from more FDs slows all transactions even further ... until the machine goes crunch.

Maybe a literal crunch - I actually, physically, blew up a test machine (fire and smoke pouring out the back!) measuring the effects of RAID on an overloaded proxy about a decade ago.



However, I'm bound to have issues when the "free" mem reaches 0... and I bet it will eventually.
That's when the double-shutdown trick will kick in.

I'll review the perl helper code, or maybe just switch to ufdbGuard.

Thanks,

Vieri
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users




