On Thu, Oct 16, 2008 at 10:10:23AM +0200, Henrik Nordstrom wrote:
> On ons, 2008-10-15 at 17:14 +0300, Henrik K wrote:
>
> > > Avoid using regex based acls.
> >
> > It's fine if you use Perl + Regexp::Assemble to optimize them. And
> > link Squid with PCRE. Sometimes you just need to block more specific
> > URLs.
>
> No it's not. Even optimized regexes are several orders of magnitude
> more complex to evaluate than the structured acls.
>
> The lookup time of dstdomain is logarithmic to the number of entries.
>
> The lookup time of regex acls is linear to the number of entries.

It's fine that you advocate avoiding regex, but a much better way is to
tell people what is actually wrong with it and how to use regexes
efficiently when they are needed. Of course you shouldn't have a
separate regex for every URL. I suggest you look at what
Regexp::Assemble does.

Optimizing 1000 x "www.foo.bar/<randomstuff>" into a _single_
"www.foo.bar/(r(egex|and(om)?)|fuba[rz])" regex is nowhere near linear.
Even if the entries are all random servers, there are only ~30 possible
characters at each position from which branches are created.
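
To make this concrete, here is a minimal sketch. The five patterns are
made up for illustration; the calls are the plain Regexp::Assemble API:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Regexp::Assemble;

  # In practice you would read the patterns from your blocklist file;
  # these are just made-up examples sharing a common prefix.
  my $ra = Regexp::Assemble->new;
  $ra->add('www\.foo\.bar/regex');
  $ra->add('www\.foo\.bar/rand');
  $ra->add('www\.foo\.bar/random');
  $ra->add('www\.foo\.bar/fubar');
  $ra->add('www\.foo\.bar/fubaz');

  # Shared prefixes are collapsed into one trie-like alternation, so
  # matching no longer scales linearly with the number of patterns.
  print $ra->as_string, "\n";
  # prints something like:
  #   www\.foo\.bar/(?:r(?:and(?:om)?|egex)|fuba[rz])

The assembled expression can then be written out to a file and loaded
into a url_regex acl in squid.conf, for example (the file path and acl
name are only placeholders):

  acl blocked url_regex -i "/etc/squid/blocked.regex"
  http_access deny blocked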