Re: Scaling concurrent TCP sessions beyond ephemeral port range

Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> · Thu, 19 May 2022 22:18:38 -0400

On 5/19/22 20:22, Praveen Ponakanti wrote:

Does anyone have recommendations on scaling concurrent connections 
through the squid proxy to above the ephemeral port range?

I know of several solutions, but probably not all of them are applicable 
to your specific situation:

1. Decrease the amount of time closed TCP connections occupy the port. 
For example, if you have many connections in TIME_WAIT state, and can 
afford to lower that state duration, it may help free ports faster.

2. If outgoing connections are closed faster (i.e. after fewer requests) 
than they should be, then fix that problem to increase connection reuse 
(and, hence, decrease port pressure). This solution is usually 
applicable to environments where you control both ends of the connection 
and see some premature closures.

3. Use more outgoing IP addresses. Without modifications, Squid would 
not automatically pick the next outgoing IP address after using up most 
of the ports on the previous one, but perhaps the OS would do the right 
thing _for_ Squid? Not sure. You can use tcp_outgoing_address with 
random ACLs to force-spread the load across multiple IPs (and, hence, 
multiple port banks). This does not work if you must use a single 
outgoing IP for some reason.

4. Modify Squid to retry port binding errors. This is easy to do but 
(without #5 below) it will not help much once ephemeral ports become 
scarce (in my experience; I have not checked what the latest kernels are 
capable of in this area).

5. If you need, say, 20-30% more concurrency (rather than 100+%) and 
cannot use multiple outbound IP addresses, then it would be possible to 
modify Squid to implement a manual port allocation algorithm that 
usually works a lot more reliably under load than ephemeral ports 
administered by the OS (last time I checked, which was a few years ago). 
You will still be bound by the TCP limit of 64K ports (minus whatever 
you want to leave for other applications that open outgoing connections) 
and various TCP-level timeouts, of course, but the number of cases where 
Squid cannot open a port because of OS port mismanagement will go down.
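
To make solutions 4 and 5 more concrete, here is a minimal Python sketch of the general idea: the application walks its own list of candidate source ports and retries when bind() reports that a port is busy. This is not Squid code and not the Web Polygraph algorithm; the helper name and its parameters are hypothetical, and a real implementation would also need a policy for choosing and recycling candidates.

```python
import errno
import socket


def bind_to_first_free_port(local_ip, candidate_ports):
    """Return a TCP socket bound to the first candidate port that is free.

    Hypothetical helper: the caller, not the OS, decides which ports to try.
    """
    for port in candidate_ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((local_ip, port))
            return sock
        except OSError as err:
            sock.close()
            if err.errno != errno.EADDRINUSE:
                raise  # unrelated failure: do not hide it
            # Port is busy: fall through and retry with the next candidate.
    raise OSError(errno.EADDRINUSE, "all candidate ports are busy")
```

A caller would pass the outgoing IP and an iterable of ports it is willing to use, for example bind_to_first_free_port("192.0.2.10", range(20000, 20100)), where both the address and the range are placeholders.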

FWIW, we successfully use solutions 3, 4, and 5 in the Web Polygraph 
benchmark (which can be configured to create lots of outgoing connections).
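
For solution 3, one detail is easy to get wrong: Squid uses the first tcp_outgoing_address line whose ACLs match, so "random" ACLs need cascading probabilities (1/3, then 1/2, then a catch-all for three addresses) to spread load evenly. The Python sketch below only generates such squid.conf lines. The ACL names and the example addresses are assumptions for illustration, and the result should be checked against the squid.conf documentation for your Squid version.

```python
def spread_rules(addresses):
    """Build squid.conf lines that pick each address with equal probability.

    Squid stops at the first matching tcp_outgoing_address line, so each
    rule must match 1/(addresses still left) of the requests that reach
    it. The last address has no ACL and catches the remainder.
    """
    lines = []
    remaining = len(addresses)
    for index, address in enumerate(addresses):
        if remaining == 1:
            lines.append(f"tcp_outgoing_address {address}")
        else:
            name = f"spread{index + 1}"  # hypothetical ACL name
            lines.append(f"acl {name} random 1/{remaining}")
            lines.append(f"tcp_outgoing_address {address} {name}")
        remaining -= 1
    return lines


# Example with documentation-only (RFC 5737) addresses:
print("\n".join(spread_rules(["192.0.2.1", "192.0.2.2", "192.0.2.3"])))
```

Each additional outgoing address adds another bank of source ports, which is what relieves the pressure on a single IP.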

I have squid v5.5 on Ubuntu with about 48K ephemeral ports available 
with the ip_local_port_range. The squid is bound to listen on port 3128 
and has a single tcp_outgoing_address configured. We notice that after 
about 40-45k concurrent connections through the proxy it is unable to 
reuse ports and it severely limits local ports available to other 
applications running on the system. The squid is set up to run 30 
workers; total CPU is still under 10% during peak connection rates.

Is any build config flag required to enable SO_REUSEPORT or SO_REUSEADDR 
on the outbound TCP sessions opened by squid?

Squid can be configured to use SO_REUSEPORT on _incoming_ connections 
(see *_port worker-queues), but that is not what you are asking about. 
Outside of that worker-queues feature, Squid will not set SO_REUSEPORT 
AFAICT.

Squid does set SO_REUSEADDR unless you use the -R command line option 
AFAICT.

It does not appear that there is an option to use the 
IP_BIND_ADDRESS_NO_PORT sockopt flag which can help with ephemeral port 
reuse.

No.
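
For readers unfamiliar with that flag, the following Python sketch shows what IP_BIND_ADDRESS_NO_PORT does on Linux (4.2 or later): bind() fixes only the source IP, and the kernel picks the source port at connect() time, when the destination is known. This illustrates the socket option only; Squid does not do this, and the function name is hypothetical. The numeric fallback 24 is the Linux value from <linux/in.h> and is assumed for Python builds that do not export the constant.

```python
import socket

# Linux value from <linux/in.h>; used if this Python does not export the name.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(local_ip, remote):
    """Connect to remote from local_ip, deferring source port selection."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Ask the kernel not to reserve a port at bind() time (Linux only).
        sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((local_ip, 0))  # fixes the source IP only
        sock.connect(remote)      # the source port is chosen here
        return sock
    except OSError:
        sock.close()
        raise
```

Here remote is an (address, port) tuple. On systems without the option, setsockopt() is expected to fail with an OSError, which this sketch simply propagates.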

We have tried enabling the tcp_tw_reuse, ip_autobind_reuse and 
ip_nonlocal_bind flags, but are unable to get the system to reuse the ephemeral 
ports. The fs.file-max is set to 4M. Pasted some errors below. Any 
suggestions are appreciated!
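
Before tuning TIME_WAIT handling, it may help to confirm which TCP states are actually holding the ports. The Python sketch below counts sockets per state from text in the Linux /proc/net/tcp format. The function name is hypothetical, and it assumes the usual column layout in which the fourth field is the state code in hex.

```python
from collections import Counter

# TCP state codes as they appear (in hex) in /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state in /proc/net/tcp formatted text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


# Typical use on a Linux host:
#   with open("/proc/net/tcp") as f:
#       print(count_tcp_states(f.read()))
```

If most entries are ESTABLISHED rather than TIME_WAIT, shortening TIME_WAIT is unlikely to free many ports, and the other solutions in this thread become more relevant.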

HTH,

Alex.

2022/05/19 23:35:00 kid12| commBind Cannot bind socket FD 3075 to 
</IP/>: (99) Cannot assign requested address

current master transaction: master48536607

2022/05/19 23:35:00 kid23| commBind Cannot bind socket FD 1320 to 
</IP/>: (99) Cannot assign requested address

current master transaction: master26662366

2022/05/19 23:37:30 kid13| commBind Cannot bind socket FD 3346 to 
</IP/>: (98) Address already in use

current master transaction: master11976056

2022/05/19 23:37:30 kid12| commBind Cannot bind socket FD 6459 to 
</IP/>: (98) Address already in use

current master transaction: master48561031

While the system is in this state, local curl requests to another endpoint on 
the same node are not able to obtain a TCP socket.

curl: (7) Couldn't connect to server

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users
