Hi,

I hope this is the right list for this.

I recently upgraded from 2.4.20 to 2.4.33, both with the ebtables patch, running an SMP kernel on a UP box, and now I have the following problem.

My box has some rules in ebtables and iptables and a bridge with 3 ports. When I try to remove eth2, which is connected to other bridges, from the bridge, the box hangs. I used KDB to catch the trace and got this:

  nf_hook_slow+0x75
  [bridge]br_send_bpdu+0x19f
  [bridge]br_send_config_bpdu+0x189
  [bridge]br_transmit_config+0xca
  [bridge]br_config_bpdu_generation+0x45
  [bridge]br_become_root_bridge+0x48
  [bridge]br_stp_disable_port+0x9a
  [bridge]__br_del_if+0x3e
  [bridge]br_del_if+0x47
  [bridge]br_ioctl_device+0x59
  [bridge]br_ioctl+0x66
  [bridge]br_dev_do_ioctl+0x91
  dev_ifsioc+0x420
  dev_ioctl+0x262
  inet_ioctl+0x1d3
  sock_ioctl+0x3f
  sys_ioctl+0x104
  system_call+0x33

I traced the problem to nf_hook_slow() trying to take a read lock on BR_NETPROTO_LOCK while br_del_if(), earlier in the same call chain, already holds the write lock. I also checked that in 2.4.20 br_send_bpdu() called dev_queue_xmit() directly, whereas now it goes through netfilter.
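To show the locking pattern I think is at fault, here is a toy user-space model in C. It is only a sketch of reader/writer semantics, not the kernel's brlock code, and all the toy_* names and the two helper functions are made up for illustration. A real lock would spin where this model returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reader/writer lock (hypothetical, NOT the kernel brlock). */
struct toy_rwlock {
    int  readers;     /* number of current read holders */
    bool write_held;  /* true while a writer holds the lock */
};

/* Read side: fails while a writer holds the lock (a real lock would spin). */
bool toy_read_trylock(struct toy_rwlock *l)
{
    if (l->write_held)
        return false;
    l->readers++;
    return true;
}

/* Write side: needs exclusive access, so no readers and no other writer. */
bool toy_write_trylock(struct toy_rwlock *l)
{
    if (l->write_held || l->readers > 0)
        return false;
    l->write_held = true;
    return true;
}

void toy_read_unlock(struct toy_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

void toy_write_unlock(struct toy_rwlock *l)
{
    assert(l->write_held);
    l->write_held = false;
}

/* Read inside read: the case the "read-locks nest" comment covers. */
bool read_inside_read_ok(void)
{
    struct toy_rwlock l = { 0, false };
    bool outer = toy_read_trylock(&l);
    bool inner = toy_read_trylock(&l);

    if (inner)
        toy_read_unlock(&l);
    if (outer)
        toy_read_unlock(&l);
    return outer && inner;
}

/* Read inside write: the br_del_if() -> nf_hook_slow() case from the trace. */
bool read_inside_write_ok(void)
{
    struct toy_rwlock l = { 0, false };
    bool outer = toy_write_trylock(&l);
    bool inner = toy_read_trylock(&l);

    if (inner)
        toy_read_unlock(&l);
    if (outer)
        toy_write_unlock(&l);
    return outer && inner;
}
```

In this model read_inside_read_ok() returns true and read_inside_write_ok() returns false, which matches what I see: nesting is fine for readers, but a reader nested under the writer on the same CPU can never make progress.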
I wrote this small patch just to see what would happen:

--- netfilter.c 2006-10-29 18:55:16.000000000 +0200
+++ netfilter.c.new 2006-10-29 18:55:09.000000000 +0200
@@ -486,7 +486,10 @@
 	}
 
 	/* We may already have this, but read-locks nest anyway */
-	br_read_lock_bh(BR_NETPROTO_LOCK);
+	if (spin_is_locked(&__br_write_locks[BR_NETPROTO_LOCK].lock))
+		printk(KERN_ERR "nf_hook_slow: BR_NETPROTO_LOCK already locked.\n");
+	else
+		br_read_lock_bh(BR_NETPROTO_LOCK);
 
 #ifdef CONFIG_NETFILTER_DEBUG
 	if (unlikely((*pskb)->nf_debug & (1 << hook))) {
@@ -509,7 +512,8 @@
 		nf_queue(*pskb, elem, pf, hook, indev, outdev, okfn);
 	}
 
-	br_read_unlock_bh(BR_NETPROTO_LOCK);
+	if (!spin_is_locked(&__br_write_locks[BR_NETPROTO_LOCK].lock))
+		br_read_unlock_bh(BR_NETPROTO_LOCK);
 	return ret;
 }

Now the kernel does not deadlock and everything seems OK, with one exception. When I use brctl to add the interface again, it says it can't enslave the port because it is already part of the bridge, and if I try to delete it again, it says that the port is not part of the bridge. After about 40 seconds everything is normal again and the interface is no longer part of the bridge.

I was wondering if this patch is OK as a workaround for this problem or if there's a better solution.

Thanks.