Re: Block IP

André Warnier <aw@xxxxxxxxxx> · Fri, 06 Jun 2008 23:57:20 +0200

Chris Tankersley wrote:

Mohit Anchlia wrote:

On 6/6/08, André Warnier <aw@xxxxxxxxxx> wrote:

    Mohit Anchlia wrote:

        On 6/5/08, André Warnier <aw@xxxxxxxxxx> wrote:

            Mohit Anchlia wrote:

                On 6/5/08, André Warnier <aw@xxxxxxxxxx> wrote:

                    Mohit Anchlia wrote:

                    On 6/5/08, André Warnier <aw@xxxxxxxxxx> wrote:

                            Mohit Anchlia wrote:

                            On 6/4/08, Dragon <dragon@xxxxxxxxxxxxxxxxxx> wrote:

                                André Warnier wrote:

                                    Mohit Anchlia wrote:

                                    2. Another question I had: sometimes we
                                    don't get the real physical IP of the
                                    machine but the IP of something that's in
                                    between, like a "router". Is there a way to
                                    get the real IP so that we don't end up
                                    blocking everyone coming from that "router"
                                    or "proxy"?
                                        In my opinion, you cannot.
                                        The whole point of such routers and
                                        proxies is to make the requests look
                                        like they are coming from the
                                        router/proxy, so that is the sender IP
                                        address you are seeing at your server
                                        level, and that's it.  Your server never
                                        receives the original requester IP
                                        address.

                                        ---------------- End original message. ---------------------

                                    There are legitimate reasons for this to be
                                    done as well; indiscriminately blocking such
                                    access is a bad idea as it will affect
                                    legitimate users. NAT and IP address sharing
                                    are among the reasons. This allows an
                                    organization to have a router with one
                                    public IP address serve a larger internal
                                    network with private IP addresses. Without
                                    this, we would have run out of IPv4
                                    addresses a long time ago.

                                    Dragon

                                If there is no way to get the real IP address,
                                then how would the router know which machine to
                                direct the response to? It has got to have some
                                information in the packet. E.g.: if A sends to
                                router B and the router sends to C, then when C
                                responds, how would B know that the response is
                                for A?

                            You are perfectly right : the router knows the real
                            IP address.  But it will not tell you, haha.

                            Seriously, this is how it works :
                            the original system sends out an "open session"
                            packet, through the router, to the final
                            destination.
                            The router sees this packet, and analyses it.  It
                            extracts the IP address and port of the original
                            sender, and keeps them in a table.
                            Then it replaces the IP address by its own, adds
                            some port number, and also memorises this new port
                            number in the same table entry.
                            Then it sends the modified packet to the external
                            server (yours).
                            It knows that the server on the other side is going
                            to respond to this same IP address and port (the
                            ones of the router).
                            When the return packet from the server comes back,
                            the router looks at the port in it, finds the
                            corresponding entry in its table, and now it knows
                            to whom it should send the packet internally.
                            And so on.
                            So :
                            - the router knows everything
                            - the internal system thinks it is talking directly
                              to the external server
                            - the external server (yours) only sees the router
                              IP and port, so it thinks that is where the packet
                              comes from.

                            That's NAT for you, in a nutshell.
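
The table lookup described above can be sketched in a few lines of Python. This is a toy model only, not how any real router is implemented: the class name, the starting port 40000 and the example addresses are all made up for illustration, and it allocates a fresh port on every outbound call instead of tracking whole sessions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatRouter:
    """Toy model of a NAT translation table (all names are hypothetical)."""
    public_ip: str
    next_port: int = 40000  # arbitrary starting port, chosen for illustration
    table: Dict[int, Endpoint] = field(default_factory=dict)

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Replace the sender with the router's own IP/port and remember it."""
        port = self.next_port
        self.next_port += 1
        self.table[port] = src  # router port -> original internal sender
        return (self.public_ip, port), dst

    def inbound(self, dst: Endpoint) -> Optional[Endpoint]:
        """Look up which internal host a reply sent to the router belongs to."""
        ip, port = dst
        if ip != self.public_ip:
            return None
        return self.table.get(port)


router = NatRouter(public_ip="203.0.113.1")
# The external server only ever sees `seen_by_server`, never 192.168.0.10.
seen_by_server, _ = router.outbound(("192.168.0.10", 51000), ("198.51.100.5", 80))
internal_host = router.inbound(seen_by_server)
```

Only the router holds `table`, which is why the server at the far end has nothing but the router's address to block.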

                            Yes ?

                            ---

                        Thanks for the great explanation. But I wonder how
                        people design apps against Denial of Service attacks.
                        Say computer A uses a Cox/Time Warner (cable) Internet
                        connection and starts attacking B; how would a system
                        be configured so that not all the users of Time
                        Warner/Cox are affected? Should the IP blocking rules
                        be granular enough to include both the IP and the
                        source port?

                    I think that is quite a different case.  Not all users of
                    an ISP (like the one you mention I suppose) are "behind" a
                    NAT router that hides their IP address.  Instead, these
                    ISPs have a large pool of public IP addresses which they
                    "own", and they attribute them dynamically to users when
                    they connect (and put the address back in the pool when
                    the user disconnects).

                    If a DOS attack came from a router with a fixed IP
                    address, and everyone
                    would know that this IP address belongs to company
                    xyz, I'm sure that it
                    would not be long before company xyz would be facing
                    a big lawsuit.

                    But in the case of an ISP, with tens of thousands of
                    customers, each one of which gets a different IP address
                    each time he turns on his computer (and anyway once per 24
                    hours in general), finding out who exactly was
                    "a234d-45hjk-dialin-atlanta.cox-t-warner.net" between
                    17:45 and 17:53 yesterday is a bit more time-consuming.

                    But in that case anyway, you do have a real
                    individual sender IP address
                    when the packet reaches your server, so you can
                    decide to block it.
                    And keep blocking all packets from this address for
                    the next 24 hours.
                    And that's exactly what many servers do.
                    And that is also why sometimes you may turn on your
                    PC at home (getting a
                    brand-new IP address) and find out that you cannot
                    connect to some server
                    because it is rejecting your IP address.  Chances are that
                    you are unlucky enough to have received today the IP
                    address that was used yesterday by someone else who used
                    it to send out 1M emails.

                    But isn't this getting a bit off-topic ?
                    If you want to know more about this, I suggest you
                    Google a bit on
                    "blacklists", "greylists" and "whitelists" for example.
                    or start here : http://en.wikipedia.org/wiki/DNSBL

                Thanks. It did go off-track a little bit, but it helps me
                understand what I should expect when doing such blocking.
                Thanks for your explanation.

                Now coming back on track, which of the 2 approaches below is
                better:

                1. Use "deny from IP" in <LocationMatch>
                2. Use RewriteCond and call a perl script dynamically.
                This lets me configure IPs dynamically without having to
                stop and start servers every time I change httpd.conf

                Is there any performance impact of using 2 over 1, or any
                other issues?

            There will be a very big difference : in case (1), the IP
            addresses or
            ranges are pre-processed by Apache at startup time, and the
            comparison will
            be made by an internal (and fast) Apache module, on the base
            of information
            in memory.  In case (2), not only are you using a rewrite of
            the URI, but in
            addition you will be executing a script, which itself is
            going to read an
            external file.  That is going to be several hundred times
            slower, at least.  Thousands of times slower if you recompile
            and execute the script with perl each time (if not under
            mod_perl).
            Now whether it matters or not in your case depends on the load
            of your server. If it is doing nothing anyway 90% of the time,
            it doesn't matter.  An Apache restart may or may not be such a
            big problem either, it all depends on your circumstances.

            But rather than using a perl script, I would definitely in
            that case use a
            mod_perl add-on module written as a PerlAccessHandler.  But
            that's another
            story, and one more for the mod_perl list.
            I would bet that there exists already such a mod_perl module
            by the way.
            Have a look here :
            http://cpan.uwinnipeg.ca/search?query=apache2&mode=dist
            or, there is probably an example in the mod_perl Cookbook

        As per your suggestion I looked at PerlAccessHandler. How would
        this approach compare, in terms of performance, with "deny from
        IP"? Is it still going to be really bad?
         <Location /URL>
           PerlAccessHandler Example::AccessHandler
         </Location>
        I will try running some tests also.

    Well again, it all depends on your circumstances, what you want to
    achieve, how many accesses you expect, why exactly you want to block
    or allow some IPs, how many different IP's or IP ranges you would
    want to allow/block, how often they change, in function of what they
    change, whether it is a big problem or not for you to do an Apache
    restart, how loaded your system is expected to be, etc..
    Even if one solution looks like it is 200 times slower than another,
    but your server is only loaded at 10% (happens more frequently than
    you would think), and it really makes your life easier for the next
    3 years, it's worth looking at.
    And even if one solution is 200 times slower than another, that can
    still mean 0.1 millisecond, so is it important for you ?

    A simple tip :
    in the Apache configuration file, you can use an "include"
    directive, I believe just about anywhere, to insert at that point
    another bit of configuration file.
    You could have a simple text file containing all your
    Deny from 1.2.3.4
    Deny from 2.3.4.5
    ...
    lines, and include it wherever you want.
    Then a simple Apache restart would re-read it.
    And this file could be written and re-written by some external script
    which decides which IPs are allowed or not. Or edited with vi
    manually, if that is how often changes happen.
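
As a rough illustration of such an external script, here is a minimal Python sketch that turns a list of addresses into "Deny from" lines and writes them to the file to be included. The function names and the idea of writing through a temporary file are assumptions added here, not something from this thread; an invalid address makes `ipaddress.ip_address` raise `ValueError`, which the sketch deliberately does not catch.

```python
import ipaddress
import os
import tempfile
from typing import Iterable


def render_deny_lines(addresses: Iterable[str]) -> str:
    """Return one "Deny from" line per valid, de-duplicated IP address."""
    unique = {ipaddress.ip_address(a.strip()) for a in addresses}
    # Sort by (IP version, numeric value) so IPv4 and IPv6 can be mixed.
    ordered = sorted(unique, key=lambda ip: (ip.version, int(ip)))
    return "".join(f"Deny from {ip}\n" for ip in ordered)


def write_include_file(path: str, addresses: Iterable[str]) -> None:
    """Write the list via a temp file so a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(render_deny_lines(addresses))
    os.chmod(tmp_path, 0o644)    # mkstemp creates the file owner-only
    os.replace(tmp_path, path)   # rename over the old file in one step
```

Something like `write_include_file("/path/to/blocked-ips.conf", ["1.2.3.4", "2.3.4.5"])` (the path is a placeholder) would then be followed by the Apache restart mentioned above.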

    If you have a PerlAccessHandler under mod_perl :
    - perl itself is part of the server, so it does not have to be
    reloaded each time
    - the handler gets compiled once the first time it is run, and the
    compiled code is re-used afterward
    - it can be smart, and only re-read the IP address list, and rebuild
    its internal table when the file changes
    - and in the meantime, it uses the table in memory
    So in that case you would not have to restart Apache, and any
    changes would take effect immediately.
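
The "only re-read the list when the file changes" idea is independent of mod_perl. Below is a small Python sketch of the same logic, just to show the mechanism; it is not a PerlAccessHandler, and the class name and the one-address-per-line file format are assumptions. Note that it still costs one stat() call per lookup, and that in a multi-process server each process would hold its own copy of the table.

```python
import os
from typing import Optional, Set


class ReloadingBlocklist:
    """Keep the blocked IPs in memory; re-read the file only when it changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime: Optional[int] = None  # mtime of the version held in memory
        self._blocked: Set[str] = set()

    def _refresh(self) -> None:
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # No file means an empty list (an assumption of this sketch).
            self._mtime, self._blocked = None, set()
            return
        if mtime != self._mtime:
            # File is new or has been modified: rebuild the in-memory table.
            with open(self.path) as handle:
                self._blocked = {line.strip() for line in handle if line.strip()}
            self._mtime = mtime

    def is_blocked(self, ip: str) -> bool:
        """Cheap check: one stat() call, then a set lookup in memory."""
        self._refresh()
        return ip in self._blocked
```

A long-lived process would create one `ReloadingBlocklist` at startup and call `is_blocked(ip)` for each request; the file is opened again only after somebody edits it.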

    Also, something else :
    So far, you have been talking about blocking HTTP accesses at the
    Apache level. But maybe you want to block more than port 80 from
    those IP addresses, and maybe you should do this outside of Apache,
    before it even gets to Apache ?

    There are many solutions, but you are the one to decide which one
    you implement.

Thanks. You are right we should not even let these people get to Apache.

We have that process in place, but it often takes time to get that
request approved and processed by the Network team. Meanwhile we want
something that lets us block ASAP. I am not sure how often this list
will change. To begin with this list is going to be empty. Only when we
experience a DoS will we update the IPs.

We expect to get 1000s of requests per second. Since it's going to be a
highly loaded server I started to think about something that would
change dynamically. You mentioned the code is compiled when apache
restarts, which means that if I keep the list of IPs as an array inside
the perl script, a change is not going to take effect until the next
restart. The only option I see then is to read the list from a flat
file. I just have one basic question about mod_perl. Does the apache web
server execute one perl process per request ? The reason I am asking is
that you mentioned I could read the list from memory, and I am not sure
how it would read from memory when this script is executed every time
it processes a request. Because if I read from a file, then every
request will open the file and read from it. It looks stateless.

Thanks for the detailed explanation. It does clear up a lot of things
and also gives me different viewpoints. The Include directive was a
great tip that I wasn't aware of.

This does go off-topic, but why not use an external program to manage
all this for you at the OS level? On *nix, OSSEC (which is free) can
watch logs for 404 errors and dynamically block IPs at the OS level via
what they term 'active response'. Once blocked, those IPs don't even
make it to Apache and it's all done dynamically. In my experience, OSSEC
will block an abusive user within just a few seconds of Apache writing
it to the logs.

For Windows, I'm sure there is software to block IPs at the OS level.
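
To make the "watch the logs for 404 errors" idea concrete, here is a minimal Python sketch that counts 404 responses per client address and returns the addresses at or above a threshold. It is not OSSEC's implementation and says nothing about how OSSEC works internally; the function name, the threshold parameter and the assumption that the log uses the Common Log Format are all choices made for this illustration.

```python
import re
from collections import Counter
from typing import Iterable, List

# Matches the start of a Common Log Format line:
# client, identd, user, [timestamp], "request", status
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (?P<status>\d{3}) ')


def abusive_ips(lines: Iterable[str], threshold: int) -> List[str]:
    """Return, sorted, the client IPs with at least `threshold` 404 responses."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[match.group("ip")] += 1
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)
```

The returned addresses would then be handed to whatever does the actual blocking (a firewall rule, or a "Deny from" list); that step is left out here.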

Sometimes, the punch-line is a long time in coming :
"We expect to get 1000s of requests per second. "

I do not really have much experience with that kind of volume (ok, I
admit, none).  But in such a case, my instinctive reaction would be to
think about a solution outside of the httpd server, before it even
starts consuming httpd server cpu cycles.  Maybe even outside of the
httpd host, before it starts consuming host cpu cycles.

If one has to log 1000's of requests/s, and have some process scan this
log to determine if some IP's need to be blocked, I have this hunch that
it's not going to work nicely.

So, what about a nice little diskless box in front, doing just that, and
save the Apache resources to serve nice pages to the nice customers ?

Now just by curiosity Mohit, what kind of site are you setting up there ?
And on which kind of system ?

André

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.