On Mon, 15 Jun 2009 12:26:16 -0700 (PDT), hims92 <himanshu.singh.cse07@xxxxxxxxxxx> wrote:
> Hi,
> As far as I know, SquidGuard uses Berkeley DB (which is based on BTree and
> Hash tables) for storing the URLs and domains to be blocked. But I need to
> store a huge number of domains (about 7 million) which are to be blocked.
> Moreover, the search time to check if the domain is in the block list
> has to be less than a microsecond.
>
> So, will Berkeley DB serve the purpose?
>
> I can search for a domain using a PATRICIA Trie in less than 0.1
> microseconds.
> So, if Berkeley DB is not good enough, how can I use the Patricia Trie
> instead of Berkeley DB in Squid to block the URL?

To do tests with such critical timing you would be best off using an
internal ACL, which eliminates the network transfer delays to an external
process.

Are you fixed to a certain version of Squid?

Squid-2 is not bad to tweak, but it is not very easy to add a new ACL to it
either. The Squid-3 ACLs are fairly easy to implement, and it is easy to
drop a new one in.

You can create your own version of dstdomain and have Squid do the test.
At present dstdomain uses an unbalanced splay tree on full reverse-string
matches, which is good but not as good as it could be for large domain
lists.

If it scales well and is faster than the existing dstdomain it would be a
welcome addition.

Amos
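
To make the reverse-string matching idea concrete, here is a rough sketch of a domain lookup that walks the labels from the TLD inwards. It is not Squid code and not a PATRICIA trie: it is a plain uncompressed trie built from Python dicts, and the names DomainTrie, add and is_blocked are made up for illustration. It also assumes that blocking a domain blocks all of its subdomains, which may not be what you want. It only shows the lookup logic and says nothing about the timing you would get from a native implementation.

```python
# Hypothetical sketch: reversed-label trie for domain block lists.
# Not Squid's dstdomain code and not a compressed (PATRICIA) trie.

_END = object()  # sentinel key marking "a blocked domain ends here"


class DomainTrie:
    """Trie keyed on domain labels in reverse order (com -> example -> www)."""

    def __init__(self):
        self.root = {}

    @staticmethod
    def _labels(domain):
        # "www.Example.com." -> ["com", "example", "www"]
        return reversed(domain.lower().strip(".").split("."))

    def add(self, domain):
        """Block `domain` and (by assumption) every subdomain of it."""
        node = self.root
        for label in self._labels(domain):
            node = node.setdefault(label, {})
        node[_END] = True

    def is_blocked(self, host):
        """Return True if `host` equals or lies under a blocked domain."""
        node = self.root
        for label in self._labels(host):
            node = node.get(label)
            if node is None:
                return False  # no blocked domain shares this suffix
            if _END in node:
                return True   # a blocked domain is a suffix of host
        return False


if __name__ == "__main__":
    trie = DomainTrie()
    trie.add("example.com")
    for host in ("www.example.com", "example.com", "notexample.com"):
        print(host, trie.is_blocked(host))
```

The lookup cost depends on the number of labels in the host name rather than on the size of the block list, which is the property that makes this family of structures attractive for large lists. A real replacement for dstdomain would need to be written against the Squid ACL code and benchmarked there.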