Re: SetEnvIf and exceptions

Hi,

I have an apache-2.4.56 install on Fedora 37 and am trying to block some bots from accessing the site, unless they're requesting our RSS feeds. How can I do this?

I'm blocking the bots with SetEnvIf lines in the .htaccess file in the document root, like:

    SetEnvIf user-agent "(?i:libwww)" stayout=1
    deny from env=stayout
    <RequireAll>
       Require all granted
       Require not env stayout
    </RequireAll>

However, adding an entry before or after those lines to explicitly allow access to the XML files doesn't seem to have any effect:

    RewriteRule linuxsecurity_features\.xml$ - [L]

The request is still blocked by the user-agent rule above. I understood the file was processed from the top down, and that processing stops once a match is made. Is that not the case? Shouldn't the RewriteRule above, if placed before the SetEnvIf rule, be enough to stop processing the .htaccess file and allow access?


The [L] flag only stops later RewriteRule directives from being processed.

Every module still gets its configuration merged from every matching configuration context, and then decides what to do with that configuration when it is handed control at various points during the request.

mod_setenvif is processed very early, so if you can stay with it for manipulating this variable, the behavior will be much more intuitive.
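
To make that concrete, here's roughly how the directives from your .htaccess map onto the modules that act on them (the comments are mine and only illustrate the processing model, they aren't a fix):

    # mod_setenvif: evaluated very early, marks matching user agents
    SetEnvIf User-Agent "(?i:libwww)" stayout=1

    # mod_rewrite: [L] only ends this rewrite ruleset; it does not stop
    # any other module from acting on the request
    RewriteRule linuxsecurity_features\.xml$ - [L]

    # mod_authz_core: applies its own merged Require configuration, so it
    # still sees stayout=1 and denies with 403 regardless of the [L] above
    <RequireAll>
        Require all granted
        Require not env stayout
    </RequireAll>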
 

I've also tried adding these RewriteRule entries to the server config via an Include, but it appears the .htaccess in the document root is always processed afterwards, even after finding a match in the server config file.


I'd suggest the following:

1. Ditch the "deny", the <RequireAll>, and the "Require all granted", leaving just "Require not env stayout".
2. Ditch the RewriteRule and do a second SetEnvIf for the exception (SetEnvIf Request_URI linuxsecurity_features\.xml$ !stayout); see the sketch below.
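
For the second point, the pair of SetEnvIf lines would look roughly like this (untested; the exception pattern is deliberately not anchored at the start, since Request_URI includes the leading slash):

    # mark unwanted bots, then clear the mark for the feed URL
    SetEnvIf User-Agent "(?i:libwww)" stayout=1
    SetEnvIf Request_URI "linuxsecurity_features\.xml$" !stayout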

This doesn't fix it, assuming I'm implementing it as you've described. Removing the <RequireAll> section produces a site-wide 500 error, with this in error_log:

.htaccess: negative Require directive has no effect in <RequireAny> directive

    SetEnvIf user-agent "(?i:libwww-perl)" stayout=1
    SetEnvIf Request_URI ^linuxsecurity_features\.*$ !stayout
    RewriteRule linuxsecurity_features\.xml$ - [L]

198.74.49.155 - - [10/Apr/2023:10:32:33 -0400] "GET /linuxsecurity_features.xml HTTP/1.1" 403 199 "-" "LWP::Simple/6.00 libwww-perl/6.05" X:"SAMEORIGIN" 0/9629 979/8868/199 H:HTTP/1.1
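
I'm also not sure my Request_URI pattern ever matches, since the request line is /linuxsecurity_features.xml with a leading slash and my pattern is anchored with ^. To check whether stayout is actually being cleared, I'm thinking of logging it from the server config, something like this (the format name and log path are just placeholders):

    # server config (not .htaccess): log the stayout variable with each request
    LogFormat "%h %t \"%r\" %>s stayout=%{stayout}e" stayout_debug
    CustomLog logs/stayout_debug.log stayout_debug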

This is all designed to prevent bots from being able to easily mirror our website. I understand individuals could just change their user agent, but services like Yandex, Acunetix, and the like won't.

dave





