Re: System hard locks with bonding and tcpdump

On Thu, 19 Jun 2003, Hen, Shmulik wrote:

> The system will occasionally freeze. When using the -p option to
> *not* set the interface into promiscuous mode, every thing is OK. Using
> KDB we were able to conclude that this is probably a deadlock between the
> br lock and either dev->queue_lock or dev->xmit_lock (see trace below).
>
>
>
> Entering kdb (current=0xc9764000, pid 17303) on processor 2 due to cpu switch
> [2]kdb> bt
> 0xc9764000    17303      779  1  002  run   0xc9764370*tcpdump
> EBP        EIP        Function (args)
> 0xc9765e80 0xc01062fb __write_lock_failed+0x7 (0xc9765e98, 0xc021bb78, 0xdebfb520, 0xf72609a0, 0xc981b120)
>                                kernel .text 0xc0100000 0xc01062f4 0xc0106314
>            0xc026025b .text.lock.brlock+0x5
>                                kernel .text 0xc0100000 0xc0260256 0xc0260260
>


	Further investigation suggests that this may be a kernel bug. A
more comprehensive analysis by Tsippy Mendelson reveals the following
details:

	The deadlock is not between the br lock and the dev locks, but
rather between different lock attempts on the same br lock. Following
the transmit path a TCP packet takes while tcpdump is running, it looks
as though nf_hook_slow() first does a br_read_lock_bh(BR_NETPROTO_LOCK),
and later, further down the path, dev_queue_xmit_nit() does a
br_read_lock(BR_NETPROTO_LOCK). In between, tcpdump tries to take
BR_NETPROTO_LOCK for writing (as seen in the trace). The result is a
write lock waiting on the first read lock, and a second read lock,
requested on the same CPU that holds the first one, waiting on the
write lock: deadlock!

	The funny thing is that just above the place where the first lock
is taken, the following comment appears:
"We may already have this, but read-locks nest anyway"
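
	To make the ordering concrete, here is a small user-space model of
the sequence described above. It is only a sketch, not kernel code: the
toy_* names and replay_deadlocks() are hypothetical, the lock models a
single CPU's slot, and the writer_prefer flag encodes the assumption that
a waiting writer holds back new readers, which is what the observed hang
would imply.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's reader/writer lock slot (hypothetical, not brlock). */
struct toy_rwlock {
	int  readers;         /* read holds currently taken */
	bool writer_waiting;  /* a writer has tried the lock and is spinning */
	bool writer_prefer;   /* ASSUMPTION: new readers wait behind a waiting writer */
};

/* One read-lock attempt; returns true if the hold is granted. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->writer_waiting && l->writer_prefer)
		return false;
	l->readers++;
	return true;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* One write-lock attempt; on failure the writer stays registered as waiting. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0) {
		l->writer_waiting = true;
		return false;
	}
	l->writer_waiting = false;
	return true;
}

/* Replays the reported order of events; returns true if nobody can progress. */
bool replay_deadlocks(bool writer_prefer)
{
	struct toy_rwlock l = { 0, false, writer_prefer };

	if (!toy_read_trylock(&l))            /* nf_hook_slow(): outer read lock */
		return false;                 /* cannot happen on a fresh lock */
	bool writer = toy_write_trylock(&l);  /* tcpdump: write lock attempt */
	bool inner  = toy_read_trylock(&l);   /* dev_queue_xmit_nit(): nested read lock */

	/* The writer needs the outer hold released, but the outer hold is only
	 * released after the nested read lock has been granted. */
	if (!writer && !inner)
		return true;

	if (inner)
		toy_read_unlock(&l);          /* nested hold */
	toy_read_unlock(&l);                  /* outer hold */
	return !toy_write_trylock(&l);        /* writer retries and gets the lock */
}
```

	In this model replay_deadlocks(true) reports a hang, while
replay_deadlocks(false) lets the nested read lock through, which is the
behaviour the comment quoted above relies on. Whether the real lock behaves
like the first case on this kernel is exactly the open question.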


Any thoughts or comments on what can be done?


-- 
| Shmulik Hen                             |
| Israel Design Center (Jerusalem)        |
| LAN Access Division                     |
| Intel Communications Group, Intel corp. |

