
Re: disallow caching based on agent


 



On Tue, 26 Oct 2010 16:34:52 -0400, alexus <alexus@xxxxxxxxx> wrote:
> On Mon, Oct 25, 2010 at 6:38 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx>
> wrote:
>> On Mon, 25 Oct 2010 12:38:49 -0400, alexus <alexus@xxxxxxxxx> wrote:
>>> is there a way to disallow serving of pages based on browser (agent)?
>>> I'm getting a lot of these:
>>>
>>> XX.XX.XX.XX - - [25/Oct/2010:16:37:44 +0000] "GET
>>> http://www.google.com/gwt/x? HTTP/1.1" 200 2232 "-"
>>> "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1
>>> UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible;
>>> Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
>>> TCP_MISS:DIRECT
>>
>> Of course. You can use:
>>
>> http://www.squid-cache.org/Doc/config/cache
>>
>> Be aware, though, that preventing an object from being cached works by
>> releasing the stored copy after the transaction has finished. The effect
>> you can expect from doing this is that a visit by GoogleBot will empty
>> your cache of most content.
>>
>> Amos
>>
> 
> I'm not sure what you mean by that. It seems my Squid gets hit by
> different bots, and I was thinking of somehow disallowing access to
> them so they don't hit me as hard... maybe it's a stupid way of
> dealing with things...

Ah. Not caching will make the impact worse. One of the things Squid offers
is reduced web server impact from visitors. Squid is front-line software.
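
A minimal sketch of the `cache` directive linked above, using a `browser`
ACL to match the User-Agent from the log earlier in the thread (the ACL
name "bots" and the regex are illustrative, not from the thread):

```
# Match requests whose User-Agent header mentions Googlebot-Mobile
acl bots browser -i Googlebot-Mobile

# Do not store responses to matching requests. Note the side effect
# described above: already-stored copies are also released.
cache deny bots

# Alternatively, refuse those agents service entirely:
# http_access deny bots
```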

 * Start by creating a robots.txt. The major bots will obey it, and you
can restrict where they go and sometimes how often.

 * Allow caching of dynamic pages where possible with squid-2.6 and
later (http://wiki.squid-cache.org/ConfigExamples/DynamicContent). Squid
will handle both the bots and normal visitors faster if it has cached
content to serve immediately instead of waiting on the origin server.

 * Check your squid.conf for performance killers (regex matches, external
helpers) and reduce the number of requests reaching those ACL tests as
much as possible. Squid routinely handles thousands of concurrent
connections for ISPs, so a visit by several bots at once should not cause
any visible load.
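
For the first point, a minimal robots.txt sketch (the paths and delay are
illustrative; Crawl-delay is non-standard and only honored by some
crawlers):

```
# Served from the web server's document root as /robots.txt
User-agent: Googlebot-Mobile
Disallow: /search/
Crawl-delay: 30

User-agent: *
Disallow: /private/
```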
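
For the second point, the wiki page boils down to dropping the old
"QUERY" deny rules and letting refresh_pattern decide for dynamic URLs;
roughly:

```
# Old-style rules to remove, which blanket-refused caching of
# anything with cgi-bin or a query string in the URL:
#   acl QUERY urlpath_regex cgi-bin \?
#   cache deny QUERY

# Replacement: dynamic URLs are cacheable when the response headers
# permit it, and never cached from stale heuristics alone
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
```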
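
For the third point, ordering matters: http_access rules are evaluated
top-down until one matches, so placing cheap ACL tests first keeps most
traffic away from the expensive ones. A sketch with illustrative names:

```
# Cheap src test first: most traffic is decided here and never
# reaches the regex below
acl localnet src 192.168.0.0/16
http_access allow localnet

# Expensive regex test only sees whatever traffic is left
acl badagents browser -i Some-Slow-Pattern
http_access deny badagents
```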


Amos

