big problems with routing performance (2.4; 2.6) compared to 2.0

Hello,

we have been using Linux PCs as routers to connect the networks of our
student halls to the internet for several years now. Recently we started to
use kernel 2.4 (and 2.6) instead of 2.0 for this task. We switched mainly
because we need some policy routing or because we wanted to enable our
students to set up QoS.

But we have found that 2.4.24 (and earlier) as well as 2.6.1 route much
slower than 2.0 on most of our hardware: dramatically so on a K6-200, less so
on a Duron 900. The CPU usage is very high (ksoftirqd_CPU0) even if the
network traffic is rather low (10 Mbit/s, consisting mostly of large packets).

We don't have a lot of routes (only about 5-20), nor does netfilter seem to
be the problem.

As far as I can see, the routing cache in combination with the dst-cache
seems to be the problem. There are a lot of hosts in the Munich science
network (including our student networks) which are infected with W32Blaster
and its derivatives. Though these worms do not generate a lot of traffic per
se, they try to contact (via UDP, some also via ICMP echo request) all IP
addresses.

Without these hosts the routing cache fills up very fast and holds several
thousand entries, but routing remains stable and CPU usage acceptable.

But with this erratic traffic the main work of the router seems to be running
the GC of the routing cache and the GC of the dst-cache. As soon as there are
2 or 3 such hosts, ping times through the router go from 1ms to 100-160ms and
CPU usage rises to 80-90% (K6-200, Intel e100 pro cards) even if there is
almost no other traffic. RAM usage is no problem at all.
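
A rough way to picture this is the toy model below. It is only a sketch and
not the kernel's algorithm: it assumes one cache entry per distinct
destination and a GC pass that walks the whole cache and expires the oldest
half whenever the cache holds more than gc_thresh entries. The function name
and the expiry rule are made up for illustration.

```python
from collections import OrderedDict


def simulate(destinations, gc_thresh=256):
    """Toy model of a bounded route cache (not the kernel's code).

    Returns (misses, gc_runs, entries_scanned).
    """
    cache = OrderedDict()
    misses = gc_runs = scanned = 0
    for dst in destinations:
        if dst in cache:
            cache.move_to_end(dst)   # hit: mark entry as recently used
            continue
        misses += 1
        cache[dst] = True            # miss: a new entry is created
        if len(cache) > gc_thresh:
            gc_runs += 1
            scanned += len(cache)    # assumed cost: GC walks every entry
            for _ in range(len(cache) // 2):
                cache.popitem(last=False)   # expire the oldest half
    return misses, gc_runs, scanned


# Ordinary traffic: many packets, few distinct destinations.
normal = [i % 50 for i in range(10000)]
# Worm-like scan: every packet goes to a new destination.
scan = list(range(10000))

print("normal:", simulate(normal))
print("scan:  ", simulate(scan))
```

In this model ordinary traffic never triggers the GC, while the scan misses
on every packet and keeps the GC running, which matches the behaviour
described above at least qualitatively.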

I played with the various knobs of the routing cache; setting
	route/max_size = 512,
	route/gc_thresh = 256,
	route/gc_min_interval = 0
seems to give the best results, with about 80-100ms.
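
For reference, a minimal sketch of how these values could be applied from a
script. It assumes the knobs live under /proc/sys/net/ipv4/route (the paths
above are given only relative to "route/"); the function name is made up for
illustration.

```python
import os

# Assumption: the "route/" knobs above are the files in this directory.
ROUTE_SYSCTL_DIR = "/proc/sys/net/ipv4/route"

# Values taken from the settings listed above.
SETTINGS = {"max_size": 512, "gc_thresh": 256, "gc_min_interval": 0}


def apply_route_settings(settings=SETTINGS, base_dir=ROUTE_SYSCTL_DIR):
    """Write each value to base_dir/<name>.

    Returns the previous contents of each file so they can be restored.
    """
    previous = {}
    for name, value in settings.items():
        path = os.path.join(base_dir, name)
        with open(path) as f:
            previous[name] = f.read().strip()
        with open(path, "w") as f:
            f.write(f"{value}\n")
    return previous
```

Calling apply_route_settings() has to happen as root on the router itself;
the returned dictionary holds the old values in case they need to be written
back.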

With a Duron 900 it is much better, but even then we easily reach 20-30ms
(compared with 1-2ms with kernel 2.0).

I tried to deactivate the routing cache by changing rt_intern_hash so that it
does not add anything to the cache (and instead calls rt_free(rt) at the
end). This helps a lot against this erratic src/dst traffic: if 1-4 such
hosts "attack" and no other traffic occurs, the pings remain in the range of
some seconds. Of course this is no real solution, because now the dst-cache GC
behaves very badly if there are more packets in the router waiting for
transmission, because for every such packet there is a dst in
dst_garbage_list with __refcnt>0.

I now believe that it is not the routing cache per se which is the problem in
this situation. The problem seems to be the interaction between the rt-cache,
the dst-cache and the packets waiting for delivery.

I would like to investigate this situation further but I don't know exactly
what I should try next.
Personally I think the best thing to try would be to get rid of the GC of the
dst-cache, probably by never letting a dst entry be used by more than one
user, so that it can be freed when dst_release is called. Is this easy to do?
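
To make the idea concrete, here is a toy comparison of the two schemes. It is
a sketch of the idea only and not kernel code: Dst is a stand-in with just a
reference count, every entry is assumed to have exactly one user (a queued
packet), and all names are made up for illustration.

```python
class Dst:
    """Stand-in for a dst entry: a reference count and a freed flag."""

    def __init__(self):
        self.refcnt = 1       # held by the one queued packet using it
        self.freed = False


def gc_pass(garbage):
    """Deferred scheme: walk the whole list, free entries nobody holds.

    Returns how many entries had to be visited.
    """
    visited = len(garbage)
    for dst in garbage:
        if dst.refcnt == 0:
            dst.freed = True
    garbage[:] = [d for d in garbage if not d.freed]
    return visited


def release_immediate(dst):
    """Proposed scheme: the last release frees the entry on the spot."""
    dst.refcnt -= 1
    if dst.refcnt == 0:
        dst.freed = True


def deferred_cost(queued, passes):
    """Entries visited by GC while `queued` packets still hold their dst,
    plus the final pass that frees them after transmission."""
    garbage = [Dst() for _ in range(queued)]
    visits = sum(gc_pass(garbage) for _ in range(passes))
    for dst in garbage:
        dst.refcnt -= 1       # packets transmitted, references dropped
    visits += gc_pass(garbage)
    return visits


def immediate_leftovers(queued):
    """With free-on-release there is no list to walk; count what is left."""
    entries = [Dst() for _ in range(queued)]
    for dst in entries:
        release_immediate(dst)
    return sum(1 for d in entries if not d.freed)


print(deferred_cost(queued=100, passes=5))
print(immediate_leftovers(queued=100))
```

In the deferred scheme every GC pass that runs while the packets are still
queued visits all entries without freeing any; with free-on-release nothing
is left behind that would need a GC.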

I would be thankful for any hints.

Greetings
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
EDV
Leopoldstraße 15
80802 München
http://www.studentenwerk.mhn.de/
