On Thu, May 22, 2003 at 03:33:30PM -0700, David S. Miller wrote:

>    If you'd like I can try to regenerate a profile, but you probably
>    already know what it will look like.
>
> I obviously know some things that will change, but I am still
> very much interested in new profiles.

Sorry for the delay -- I was away for a few days.

Here are profile results from the same machine (still with XT-PIC),
the same 300000 route entries, and your original patch that fixes the
hashing. I should also mention that in all of these tests I have one
filter rule in the INPUT chain (after routing) to avoid sending back
zillions of ICMP packets out to the spoofed source IPs.

...
    27 check_pgt_cache                   0.8438
  1430 ip_rcv_finish                     2.4870
   135 ipv4_dst_destroy                  2.8125
   357 cpu_idle                          3.1875
  7714 ip_route_input_slow               3.3481
   434 fib_rules_policy                  3.8750
  2952 ip_rcv                            5.2714
    85 kmem_cache_alloc                  5.3125
  2188 netif_receive_skb                 5.4700
  2734 alloc_skb                         5.6958
   822 skb_release_data                  5.7083
  2161 __kfree_skb                       5.8723
   572 ip_local_deliver                  5.9583
  1023 __constant_c_and_count_memset     6.3937
  3801 fib_validate_source               6.7875
  6778 rt_garbage_collect                7.1801
   497 __fib_res_prefsrc                 7.7656
  3035 inet_select_addr                  8.2473
  2717 tcp_match                         8.4906
   552 ipt_hook                          8.6250
   706 kmalloc                           8.8250
  1561 kfree                             8.8693
  1287 jhash_3words                      8.9375
  5937 nf_hook_slow                     10.9136
  2532 fib_semantic_match               12.1731
  2356 eth_type_trans                   12.2708
  2166 nf_iterate                       12.3068
  4446 net_rx_action                    12.6307
  1622 kfree_skbmem                     12.6719
   842 rt_hash_code                     13.1562
 16030 ipt_do_table                     14.5199
  2104 tg3_recycle_rx                   14.6111
 13795 tg3_rx                           14.6133
  5667 __kmem_cache_alloc               17.7094
  1193 ipt_route_hook                   18.6406
  2851 do_gettimeofday                  19.7986
  7423 fib_lookup                       23.1969
  1497 fib_rule_put                     23.3906
  8803 ip_packet_match                  26.1994
  4970 dst_destroy                      28.2386
 22479 rt_intern_hash                   29.2695
  8804 kmem_cache_free                  55.0250
  8380 dst_alloc                        58.1944
 18252 fn_hash_lookup                   63.3750
 25473 tg3_interrupt                    75.8125
 24036 do_softirq                      100.1500
 51355 ip_route_input                  118.8773
 57304 tg3_poll                        188.5000
111691 handle_IRQ_event                698.0688
168828 default_idle                   2637.9375

Full profile output available here: http://blue.netnation.com/sim/ref/
    readprofile.full_route_table_hash_fixed.*

Note that if I increase the packet rate and NAPI kicks in, all of the
handle_IRQ and similar overhead basically disappears because it no
longer uses IRQs. Pretty spiffy. Here is a profile of that:

...
    25 tasklet_hi_action                 0.1562
    46 timer_bh                          0.2054
    97 net_rx_action                     0.2756
    93 tg3_vlan_rx                       0.3875
   158 tg3_poll                          0.5197
  1630 ip_rcv_finish                     2.8348
   142 ipv4_dst_destroy                  2.9583
   429 fib_rules_policy                  3.8304
  8959 ip_route_input_slow               3.8885
  2438 ip_rcv                            4.3536
  2504 alloc_skb                         5.2167
  1991 __kfree_skb                       5.4103
  2279 netif_receive_skb                 5.6975
   929 skb_release_data                  6.4514
   669 ip_local_deliver                  6.9688
  1175 __constant_c_and_count_memset     7.3438
  2367 tcp_match                         7.3969
   124 kmem_cache_alloc                  7.7500
  4535 fib_validate_source               8.0982
   598 __fib_res_prefsrc                 9.3438
  8896 rt_garbage_collect                9.4237
  3582 inet_select_addr                  9.7337
  1747 kfree                             9.9261
   717 ipt_hook                         11.2031
   938 kmalloc                          11.7250
  1747 jhash_3words                     12.1319
  6879 nf_hook_slow                     12.6452
  2439 eth_type_trans                   12.7031
  1695 kfree_skbmem                     13.2422
  2358 nf_iterate                       13.3977
   872 rt_hash_code                     13.6250
  2933 fib_semantic_match               14.1010
 16553 ipt_do_table                     14.9937
 15339 tg3_rx                           16.2489
  2482 tg3_recycle_rx                   17.2361
  5967 __kmem_cache_alloc               18.6469
  1237 ipt_route_hook                   19.3281
  3120 do_gettimeofday                  21.6667
  8299 ip_packet_match                  24.6994
  8031 fib_lookup                       25.0969
  1877 fib_rule_put                     29.3281
  6088 dst_destroy                      34.5909
 26833 rt_intern_hash                   34.9388
 10666 kmem_cache_free                  66.6625
 20193 fn_hash_lookup                   70.1146
 10516 dst_alloc                        73.0278
 64803 ip_route_input                  150.0069

Full profile output available as:
    readprofile.full_route_table_hash_fixed_napi.*

Hmm.. I see there is some redundant hashing going on in
ip_route_input_slow() (called only from ip_route_input(), which
already calculates the hash), but my patch to fix that adds yet
another argument to ip_route_input_slow(), which isn't that pretty.
It looks like that function isn't using much CPU anyway.

Why is ip_route_input() so heavy still?
This kernel is compiled with CONFIG_SMP, which makes the read_lock()
calls actually do something, but it looks like they should be fairly
light. Should I add an iteration counter to the for loop, perhaps?

Simon-