* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Tue, 24 Jul 2007, Ingo Molnar wrote: > > > > please try the patch below instead. > > I'm hoping this is just a "let's see if the behavior changes" patch, > not something that you think should be applied if it fixes something? > > This patch looks like it is trying to paper over (rather than fix) > some possible bug in the "->disable" logic. Makes sense as a "let's > see if it's that" kind of thing, but not as a "let's fix it". > > Or am I missing something? yeah - it's a totaly bad and unacceptable hack (i realized how bad it was when i wrote up that comment section ...), i just wanted to see which portion of ne2k/lib8390.c is sensitive to the fact whether an irq line is masked or not. The patch has no SOB line either. the current best fix forward is to undo my original change, unless we find a better fix for this problem. (Note that the other patches posted in this thread are broken too: they only mask the irq but dont reliably unmask it.) here's the current method of handling irqs for Marcin's card: 17: 12 IO-APIC-fasteoi eth1, eth0 and fasteoi is a really simple sequence: no masking/unmasking by the flow handler itself but a NOP at entry and an APIC-EOI at the end. The disable/enable irq thing should thus have minimal effect if done within an irq handler. now ne2k does something uncommon: for xmit (which is normally done outside of irq handlers) it will disable_irq_nosync()/enable_irq() the interrupt. It does it to exclude the handler from _that_ CPU, but due to the _nosync it does not exclude it from any other CPUs. So that's a bit weird already. just in case, i've just re-checked all the genirq bits that change IRQ_DISABLED (that bit accidentally clear would be the only way to truly allow an IRQ handler to interrupt the disable_irq_nosync() critical section on that CPU) - but i can see no way for that to happen: we unconditionally detect and report unbalanced and underflowing irq_desc->depth, and the only other place (besides enable_irq()) that clears IRQ_DISABLED is __set_irq_handler(), and on x86 that is only used during bootup. Marcin, could you try the patch below too? [without having any other patch applied.] It basically turns the critical section into an irqs-off critical section and thus checks whether your problem is related to that particular area of code. Ingo Index: linux/drivers/net/lib8390.c =================================================================== --- linux.orig/drivers/net/lib8390.c +++ linux/drivers/net/lib8390.c @@ -297,9 +297,7 @@ static int ei_start_xmit(struct sk_buff * Slow phase with lock held. */ - disable_irq_nosync_lockdep_irqsave(dev->irq, &flags); - - spin_lock(&ei_local->page_lock); + spin_lock_irqsave(&ei_local->page_lock, flags); ei_local->irqlock = 1; @@ -376,8 +374,7 @@ static int ei_start_xmit(struct sk_buff ei_local->irqlock = 0; ei_outb_p(ENISR_ALL, e8390_base + EN0_IMR); - spin_unlock(&ei_local->page_lock); - enable_irq_lockdep_irqrestore(dev->irq, &flags); + spin_unlock_irqrestore(&ei_local->page_lock, flags); dev_kfree_skb (skb); ei_local->stat.tx_bytes += send_length; - To unsubscribe from this list: send the line "unsubscribe linux-net" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html