Re: kernel 2.6.25-rc7 highly unstable on high load

David Miller wrote:
From: "Denys Fedoryshchenko" <denys@xxxxxxxxxxx>
Date: Thu, 27 Mar 2008 08:35:06 +0200

It seems I am having very bad luck with 2.6.27. As Linus said, it has to be released soon, but it is crashing like hell under high network load.

That's amazing, you've taken a trip into the future and are running
2.6.27 already, please let me borrow your time machine :-)

More seriously, there is obviously something very unique to your
setup or else everyone would be reporting this crash, and we have
to find out what that might be.

There seems to be a bunch of netfilter stuff in your traces, but
the top of the trace is somewhere totally unrelated.  This is
a common recurrence in your crash traces, making them less
useful than they could be.

I know you asked before what can be done to improve the traces,
but I'm not an x86 expert so I have no idea how to help you
in that area.

Patrick, could you see if you can make any sense of his log?
I see conntrack a lot in the backtraces.

I can see rt_garbage_collect() involved here. This one might explain very long delays in softirq processing, and eventually crashes...

Denys, could you post the output of:

grep . /proc/sys/net/ipv4/route/*

rtstat -c1 -i10

So that we can check if you should first change route cache tunables :)
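
If it is easier to collect these values from a script, the grep above can be approximated by the sketch below. It is only an illustration: the helper name dump_tunables is made up for this example, it returns the whole content of each file rather than matching line by line, and it assumes the files are plain text.

```python
import os


def dump_tunables(directory="/proc/sys/net/ipv4/route"):
    """Return {file name: contents} for each readable, non-empty file.

    Rough equivalent of `grep . directory/*`; hypothetical helper,
    not part of any existing tool.
    """
    values = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            with open(path) as fh:
                text = fh.read().strip()
        except OSError:
            # Some entries may not be readable; skip them.
            continue
        if text:
            values[name] = text
    return values


if __name__ == "__main__":
    route_dir = "/proc/sys/net/ipv4/route"
    if os.path.isdir(route_dir):
        for name, text in dump_tunables(route_dir).items():
            print(f"{name}: {text}")
    else:
        print(f"{route_dir} not present on this system")
```

The directory is a parameter so the same function can be pointed at a saved copy of the files.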


Thanks.

Here is a message I got over syslog on the last crash (it was 2.6.25-rc6-git6), also available at http://www.nuclearcat.com/files/crash_2.6.25.txt

Mar 26 02:27:14 ROUTER [ 4698.694693] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c02ad109, registers:
Mar 26 02:27:14 ROUTER [ 4698.694693] Process snmpd (pid: 2327, ti=c092e000 task=f7459080 task.ti=f70b7000)
Mar 26 02:27:14 ROUTER [ 4698.694693] Stack: c092eb14 c011991e f750d600 f750d600 c0378058 00000001 c092eb34 c0119b3b
Mar 26 02:27:14 ROUTER [ 4698.694693]        00000000 00000001 00000082 f708af88 c0378058 00000001 c092eb3c c0119bfe
Mar 26 02:27:14 ROUTER [ 4698.694693]        c092eb50 c012f19c 00000000 f708af88 c0378058 c092eb74 c011652a 00000000
Mar 26 02:27:14 ROUTER [ 4698.694693] Call Trace:
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011991e>] ? task_rq_lock+0x31/0x58
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0119b3b>] ? try_to_wake_up+0x19/0xd1
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0119bfe>] ? default_wake_function+0xb/0xd
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c012f19c>] ? autoremove_wake_function+0xf/0x33
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011652a>] ? __wake_up_common+0x2f/0x5a
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c01189b8>] ? __wake_up+0x28/0x3b
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c01201a3>] ? wake_up_klogd+0x2e/0x31
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c012033d>] ? release_console_sem+0x197/0x19f
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0120747>] ? vprintk+0x295/0x2e5
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f899634c>] ? death_by_timeout+0x8b/0xa3 [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f8999d08>] ? tcp_packet+0x931/0x9e5 [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c01207ac>] ? printk+0x15/0x17
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011fb65>] ? warn_on_slowpath+0x2a/0x51
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011764a>] ? __update_rq_clock+0x1c/0x126
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0116ab3>] ? update_curr+0x48/0x64
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f89961ed>] ? nf_ct_invert_tuple+0x63/0x6f [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f8996cca>] ? nf_conntrack_tuple_taken+0xf8/0x100 [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f899850c>] ? __nf_ct_helper_find+0x2c/0x90 [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f8996b95>] ? nf_conntrack_alter_reply+0x4a/0x87 [nf_conntrack]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<f8974976>] ? nf_nat_setup_info+0x3cc/0x55a [nf_nat]
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011701c>] ? dequeue_rt_entity+0x88/0x171
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0117127>] ? dequeue_rt_stack+0x22/0x27
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0117425>] ? enqueue_task_rt+0x19/0x2c
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c011617f>] ? enqueue_task+0xd/0x18
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c01161c0>] ? activate_task+0x1e/0x2b
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0119bb1>] ? try_to_wake_up+0x8f/0xd1
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0119c1b>] ? wake_up_process+0xf/0x11
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c013dfa1>] ? softlockup_tick+0x9d/0x10b
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0126f5c>] ? run_local_timers+0x17/0x19
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c01272fa>] ? update_process_times+0x24/0x49
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0135f4c>] ? tick_periodic+0x62/0x6e
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0135f71>] ? tick_handle_periodic+0x19/0x68
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c010e87b>] ? smp_apic_timer_interrupt+0x6c/0x81
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0104344>] ? apic_timer_interrupt+0x28/0x30
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c02ad202>] ? _spin_lock_bh+0x20/0x22
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c02751fa>] ? rt_garbage_collect+0x132/0x27a
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0262d95>] ? dst_alloc+0x19/0x63
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0276eb1>] ? ip_route_input+0x6b9/0xbd9
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0278898>] ? ip_rcv_finish+0x2c/0x29a
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0278ef8>] ? ip_rcv+0x202/0x22c
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c025ee4e>] ? netif_receive_skb+0x33e/0x3a9
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c02612c2>] ? process_backlog+0x62/0xb5
Mar 26 02:27:14 ROUTER [ 4698.694693]  [<c0260d27>] ? net_rx_action+0x8f/0x191
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c01240a7>] ? __do_softirq+0x64/0xcd
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0105f0a>] ? do_softirq+0x55/0x89
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0123f88>] ? local_bh_enable+0x61/0x6d
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0257689>] ? lock_sock_nested+0x83/0x8b
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0292e58>] ? udp_destroy_sock+0xd/0x20
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0257b9e>] ? sk_common_release+0x15/0x60
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c02924a4>] ? udp_lib_close+0x8/0xa
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0299006>] ? inet_release+0x42/0x48
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c025625b>] ? sock_release+0x14/0x60
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c02565d9>] ? sock_close+0x29/0x30
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c015a6a2>] ? __fput+0x93/0x135
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c015a8e2>] ? fput+0x17/0x19
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c01583dc>] ? filp_close+0x47/0x51
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0159414>] ? sys_close+0x68/0x9d
Mar 26 02:27:14 ROUTER [ 4698.694694]  [<c0103876>] ? sysenter_past_esp+0x5f/0x85
Mar 26 02:27:14 ROUTER [ 4698.694694]  =======================
Mar 26 02:27:14 ROUTER [ 4698.694694] Code: 94 c0 84 c0 b9 01 00 00 00 75 09 f0 81 02 00 00 00 01 30 c9 5d 89 c8 c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 f0 81 28 00 00 00 01 74 05 e8 64 fd
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
