>
> RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
> RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
> RewriteCond %{HTTP_USER_AGENT} ^.*(HTTrack|clshttp|archiver|loader|email|nikto|miner|python).* [NC,OR]
> RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract).* [NC,OR]
> RewriteCond %{HTTP_USER_AGENT} ^.*(Googlebot|SemrushBot|PetalBot|Bytespider|bingbot).* [NC]
> RewriteRule (.*) https://guardiandigital.com/$1 [L,R=301]
>
>
> SetEnvIf user-agent "(?i:GoogleBot)" googlebot=1
> SetEnvIf user-agent "(?i:SemrushBot)" googlebot=1
> SetEnvIf user-agent "(?i:PetalBot)" googlebot=1
> SetEnvIf user-agent "(?i:Bytespider)" googlebot=1
> SetEnvIf user-agent "(?i:bingbot)" googlebot=1
>
>
> <RequireAny>
> Require ip 1.2.3.4
> Require env googlebot
> </RequireAny>
>
I would think that mod_security is more efficient for this:

SecRule REQUEST_HEADERS:User-Agent "xxxx" "id:'13006',phase:2,log,deny,status:200"

Why allow SemrushBot, PetalBot and Bytespider at all? If they don't give you traffic, block them. Better to add entries for Yandex and DuckDuckGo instead; DuckDuckGo is getting better than Google. Maybe start looking out for AI crawlers as well.
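As a minimal sketch of what such a rule could look like with an explicit bot pattern filled in (the rule id 13010 and the bot list here are just placeholders I picked, adjust to your own situation):

# Deny requests whose User-Agent matches crawlers you don't want.
# id 13010 and the bot names are assumptions - use your own list.
SecRule REQUEST_HEADERS:User-Agent "@rx (?i:SemrushBot|PetalBot|Bytespider|MJ12bot|AhrefsBot)" \
    "id:13010,phase:1,log,deny,status:403,msg:'Unwanted crawler blocked'"

Running it in phase 1 (request headers) means the request is rejected before the body is even read, which is part of why mod_security tends to be cheaper than chains of RewriteCond checks.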
> I was also originally trying to associate the RewriteRules with the
> RequireAny using <If>, but then realized I didn't even have to do that -
> they just get processed independently anyway. It looks so simple now,
> but it took me a while to make it this simple.
>
>
What also helps is blocking these clouds; just grab their published IP ranges (see the sketch after this list):
- Amazon (AWS)
- Google Cloud (googleusercontent)
- DigitalOcean
- OVH
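Something along these lines would do it with mod_authz_core, assuming httpd 2.4 (the CIDR blocks below are RFC 5737 documentation ranges, not real provider ranges - substitute each provider's published lists, one "Require not ip" per block):

<RequireAll>
    Require all granted
    # Placeholder ranges only - replace with the providers' real CIDR blocks
    Require not ip 192.0.2.0/24
    Require not ip 198.51.100.0/24
    Require not ip 203.0.113.0/24
</RequireAll>

You could merge this with the existing RequireAny so that verified search-engine IPs still get through while generic cloud ranges are refused.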
PS. Don't give Google the credit of having the bot variable named after them ;).