
Re: Re: squid 3.2.0.14 with TPROXY => commBind: Cannot bind socket FD 773 to xxx.xxx.xxx.xx: (98) Address


Hey,

It can be tested in a matter of minutes.
If we have a test candidate, I will write a small TPROXY script to
verify the suspicion.

Eliezer

On 09/14/2013 07:39 PM, Nikolai Gorchilov wrote:
> Hi, Eliezer,
> 
> On Tue, Sep 10, 2013 at 1:49 AM, Eliezer Croitoru <eliezer@xxxxxxxxxxxx> wrote:
>> Hey Nickolai,
>>
>> I will try to make sense of what you have seen.
>> TPROXY is a very complex feature, and the kernel cannot bind the same
>> src(ip:port) + dst(ip:port) pair twice.
>> Say, for example, the client 10.100.1.100 tries to connect to
>> 2.3.4.5 on port 80.
>> The client tries once with:
>> 10.100.1.100:5455 to 2.3.4.5:80
>> Then, say the client doesn't have the right route and there is a
>> network problem, so the client tries again from:
>> 10.100.1.100:5456 to 2.3.4.5:80
>> The client above has a network issue, and the proxy knows that.
>> The proxy is transparent and needs to re-intercept the same request
>> twice, and once the first connection has timed out at the kernel level,
>> the application can drop the connection and stop parsing the
>> request.
> 
> The problem I'm facing is not related to user to proxy connection at
> all. With proper network setup this works flawlessly.
> 
> It's the proxy to server connection when squid tries to bind to an IP,
> without specifying a port, thus leaving the kernel to choose one.
> 
>> the kernel can bind the ip:port of the src to the dst if it knows that
>> all 80 port traffic is using only the traffic as a route.
>> If that is not the case, the client will have trouble, and hence
>> binding ip:port to ip:port at the network layer would be a disaster
>> for a couple of layers.
> 
> Yeah! ip:port pairs have to be unique :-)
> 
>> So the kernel manages what the bind will look like.
>> I don't see how a TPROXY-enabled system for more than 10,000 clients can
>> reach a critical level of commBind errors unless the CPU and all the lower
>> levels of the kernel cannot handle this level of traffic.
> 
> It's not about the number of users, but the number of simultaneous live
> connections from the cache server. Keep in mind that "idle" HTTP
> connections are still "live" TCP streams.
> 
>> if it's the range thing from the kernel it can be reproduced in a matter
>> of seconds by lowering it..
> 
> Exactly. Try something like echo 32768 32867 >
> /proc/sys/net/ipv4/ip_local_port_range and you'll start getting
> EADDRINUSE on the 101st parallel outbound connection of squid.
> 
>> This limit is not a rule for the application; it limits which
>> local-ip:port the kernel binds when the source machine is the local machine.
>> It doesn't force the kernel to handle fewer connections, but it
>> lets the kernel do fewer lookups when trying to find a free ip:port
>> socket to bind to the new connection.
>>
>> It seems to me that you are using connection tracking on a TPROXY system
>> that doesn't need connection tracking at all at this kind of scale.
>> There is no reason for a TPROXY system to keep track of client
>> connections for more than 5-10 minutes, tops.
>>
>> Try looking more into connection tracking rather than the basic
>> kernel land.
> 
> Nope. The problem has nothing to do with TPROXY or connection
> tracking. It's in the kernel's port auto-selection algorithm, which
> limits the number of live auto-selected ports to
> ip_local_port_range.max - ip_local_port_range.min + 1.
> 
> Here's some pseudocode to reproduce it, even with local addresses
> assigned to the host:
> 
> ===[cut]===
> $broken = true; // true: ask the kernel to select the port (bind to port 0)
> $port_min = ip_local_port_range.min;
> $port_max = ip_local_port_range.max;
> $ips_to_test_with = ['aaa.aaa.aaa.aaa', 'bbb.bbb.bbb.bbb'];
> $sockets = []; // keep every socket open so its port stays occupied
> 
> function socket_setup($ip, $port) {
>     $socket = new socket(AF_INET, SOCK_STREAM, SOL_TCP);
>     $socket.set_option(SOL_SOCKET, SO_REUSEADDR, 1);
>     // needed only if $ips_to_test_with are not assigned to the host
>     $socket.set_option(SOL_IP, IP_TRANSPARENT, 1);
>     $socket.bind($ip, $port);
>     // listen is easier and faster for testing: we just have to block this
>     // socket in the kernel somehow. In real life it would be $socket.connect.
>     $socket.listen();
>     return $socket;
> }
> 
> for ($port = $port_min; $port <= $port_max; $port++) {
>     foreach ($ips_to_test_with as $ip) {
>         if ($broken) {
>             // raises an exception on iteration
>             // floor(($port_max - $port_min + 1) / count($ips_to_test_with)) + 1
>             $sockets[] = socket_setup($ip, 0);
>         } else {
>             // assigns all the ports
>             $sockets[] = socket_setup($ip, $port);
>         }
>     }
> }
> 
> ===[cut]===
> 
> That's it. Do echo 32768 32867 >
> /proc/sys/net/ipv4/ip_local_port_range and try it. Once with $broken =
> true, and then again with $broken = false.
> 
> When $broken = true, you'll get EADDRINUSE on the 51st port assignment
> on IP address aaa.aaa.aaa.aaa.
> When $broken = false, you'll get both aaa.aaa.aaa.aaa and
> bbb.bbb.bbb.bbb listening on 100 ports each and no error.
> 
> Hope this time it's more clear.
> 
> Best,
> Niki
> 
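
The size of the auto-selection pool discussed above can be checked before
running any test. The following is a minimal Python sketch, assuming a Linux
host; the helper names parse_port_range and auto_port_capacity are
hypothetical, and the range is treated as inclusive at both ends, which
matches the 100-port example (32768 to 32867) in the message.

```python
from pathlib import Path

# Path taken from the message; it exists on Linux only.
RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse the whitespace-separated 'min max' pair stored in the file."""
    low, high = (int(field) for field in text.split())
    return low, high


def auto_port_capacity(low, high):
    """Count the ports in an inclusive [low, high] range."""
    return high - low + 1


if __name__ == "__main__" and RANGE_FILE.exists():
    low, high = parse_port_range(RANGE_FILE.read_text())
    print(f"{low}-{high}: {auto_port_capacity(low, high)} auto-selectable ports")
```

With the range from the message, auto_port_capacity(32768, 32867) gives 100,
which is why the 101st parallel outbound connection is the first one expected
to fail.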

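
The pseudocode in the quoted message could be turned into a runnable test
roughly as follows. This is only a sketch in Python, not the script either
poster used: the function names are hypothetical, IP_TRANSPARENT is left out
on the assumption that the test addresses are local to the host (for example
127.0.0.1 and 127.0.0.2 on Linux), and the predicted failing round simply
restates the arithmetic from the message rather than a guaranteed kernel
behaviour.

```python
import errno
import socket


def predicted_failing_round(port_min, port_max, ip_count):
    """1-based round at which the message predicts EADDRINUSE when
    ip_count addresses share one inclusive auto-selection range."""
    return (port_max - port_min + 1) // ip_count + 1


def bind_listener(ip, port):
    """Bind and listen on (ip, port); port 0 lets the kernel choose."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((ip, port))
        # listen() just keeps the port occupied, as in the pseudocode.
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


def run(ips, port_min, port_max, auto_select):
    """Open one listener per (port, ip) pair and keep them all open.

    Returns (sockets_opened, errno_name), where errno_name is None if
    every bind succeeded.
    """
    held = []
    try:
        for port in range(port_min, port_max + 1):
            for ip in ips:
                held.append(bind_listener(ip, 0 if auto_select else port))
        return len(held), None
    except OSError as exc:
        return len(held), errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        for sock in held:
            sock.close()
```

After narrowing ip_local_port_range as described in the message, calling
run(["127.0.0.1", "127.0.0.2"], 32768, 32867, auto_select=True) and then
again with auto_select=False lets the two cases be compared on a given
kernel. The process needs a file descriptor limit above 200 for the second
case.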



