Re: [PATCH 2/5] tg3: Fix std rx prod ring handling

Great. I spent a few hours bisecting this, only to find this patch
when I googled for the offending commit. :)

I've been hitting this bug, and yesterday I finally found a repeatable
test case. That was after the BUG message managed to reach the disk
once, which showed that it was a NULL pointer dereference in
tg3_poll_work.

Other symptoms were packets filled with 0x5a and truncated packets,
both seen in tcpdump on the device, usually followed by a kernel
panic shortly afterwards.

I've confirmed that 2.6.33-rc3-git4 with your patch applied on top
fixes this issue for me.

Jan 13 00:39:40 navi kernel: [  326.384190] BUG: unable to handle kernel NULL pointer dereference at 000000a4
Jan 13 00:39:40 navi kernel: [  326.384190] IP: [<c136b9d2>] tg3_poll_work+0x4f3/0x906
Jan 13 00:39:40 navi kernel: [  326.384190] *pde = 00000000 
Jan 13 00:39:40 navi kernel: [  326.384190] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
Jan 13 00:39:40 navi kernel: [  326.384190] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-3/speed
Jan 13 00:39:40 navi kernel: [  326.384190] Modules linked in: sdhci_pci thinkpad_acpi sdhci
Jan 13 00:39:40 navi kernel: [  326.384190] 
Jan 13 00:39:40 navi kernel: [  326.384190] Pid: 734, comm: usb-storage Not tainted 2.6.33-rc3-git4 #2 1866CTO/1866CTO
Jan 13 00:39:40 navi kernel: [  326.384190] EIP: 0060:[<c136b9d2>] EFLAGS: 00010246 CPU: 0
Jan 13 00:39:40 navi kernel: [  326.384190] EIP is at tg3_poll_work+0x4f3/0x906
Jan 13 00:39:40 navi kernel: [  326.384190] EAX: f72bbc50 EBX: f5704380 ECX: f72bbc50 EDX: f48f0f40
Jan 13 00:39:40 navi kernel: [  326.384190] ESI: 00000000 EDI: 00000000 EBP: c175bf8c ESP: c175bf3c
Jan 13 00:39:40 navi kernel: [  326.384190]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Jan 13 00:39:40 navi kernel: [  326.384190] Process usb-storage (pid: 734, ti=c175b000 task=f50223a0 task.ti=f5016000)
Jan 13 00:39:40 navi kernel: [  326.384190] Stack:
Jan 13 00:39:40 navi kernel: [  326.384190]  00000001 00000000 c175bf7c 00000154 00000002 0000003f 00010000 f4a7aa60
Jan 13 00:39:40 navi kernel: [  326.384190] <0> f4a9ea60 f5704380 f48f0f40 00000153 00000042 f5704400 f57047d8 00000000
Jan 13 00:39:40 navi kernel: [  326.384190] <0> 00000153 f5704400 f5704380 f4897000 c175bfac c136be77 00000040 f5704384
Jan 13 00:39:40 navi kernel: [  326.384190] Call Trace:
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c136be77>] ? tg3_poll+0x92/0x18b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c141dbbd>] ? net_rx_action+0x7d/0x1de
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102e2ba>] ? __do_softirq+0x0/0x18c
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102e377>] ? __do_softirq+0xbd/0x18c
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102e2ba>] ? __do_softirq+0x0/0x18c
Jan 13 00:39:40 navi kernel: [  326.384190]  <IRQ> 
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102df8f>] ? irq_exit+0x3b/0x77
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1003dbd>] ? do_IRQ+0x7d/0x90
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1002cee>] ? common_interrupt+0x2e/0x34
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c104007b>] ? kfifo_from_user+0x3f/0x6c
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9c6>] ? _raw_spin_unlock_irq+0x2d/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102299c>] ? T.1280+0x5d/0x94
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c102293f>] ? T.1280+0x0/0x94
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152857a>] ? schedule+0x410/0x46c
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1528770>] ? schedule_timeout+0x1c/0x1d9
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152800e>] ? wait_for_common+0x31/0x100
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c104ceb2>] ? trace_hardirqs_on+0xb/0xd
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9cb>] ? _raw_spin_unlock_irq+0x32/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c10213bb>] ? sub_preempt_count+0x8b/0x98
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1528095>] ? wait_for_common+0xb8/0x100
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1024d98>] ? default_wake_function+0x0/0x12
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1528115>] ? wait_for_completion_interruptible_timeout+0x12/0x14
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13a8fe3>] ? usb_stor_msg_common+0xee/0x112
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13a95b2>] ? usb_stor_bulk_transfer_buf+0x46/0x73
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13a96ad>] ? usb_stor_Bulk_transport+0xce/0x24e
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13a916e>] ? usb_stor_invoke_transport+0x1c/0x298
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa78c>] ? usb_stor_control_thread+0x55/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa78c>] ? usb_stor_control_thread+0x55/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c104ceb2>] ? trace_hardirqs_on+0xb/0xd
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152a9cb>] ? _raw_spin_unlock_irq+0x32/0x47
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13a8d2f>] ? usb_stor_transparent_scsi_command+0xd/0xf
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa852>] ? usb_stor_control_thread+0x11b/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c152871c>] ? preempt_schedule+0x35/0x44
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c10209d6>] ? complete+0x39/0x43
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c103f838>] ? kthread+0x63/0x68
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c103f7d5>] ? kthread+0x0/0x68
Jan 13 00:39:40 navi kernel: [  326.384190]  [<c1002cfa>] ? kernel_thread_helper+0x6/0x10
Jan 13 00:39:40 navi kernel: [  326.384190] Code: 7a b2 0a 00 8b 53 58 31 c0 8d 4a 60 85 d2 8b 15 b8 38 76 c1 0f 45 c1 8b 7a 18 85 ff 74 0a 6a 02 8b 4d e0 31 d2 ff d7 58 8b 55 d8 <8b> b6 a4 00 00 00 8b 4d e0 8b 82 a4 00 00 00 89 c7 f3 a4 8b 53 
Jan 13 00:39:40 navi kernel: [  326.384190] EIP: [<c136b9d2>] tg3_poll_work+0x4f3/0x906 SS:ESP 0068:c175bf3c
Jan 13 00:39:40 navi kernel: [  326.384190] CR2: 00000000000000a4
Jan 13 00:39:40 navi kernel: [  326.385188] ---[ end trace aa7144f771ba6580 ]---
Jan 13 00:39:40 navi kernel: [  326.385188] Kernel panic - not syncing: Fatal exception in interrupt
Jan 13 00:39:40 navi kernel: [  326.385188] Pid: 734, comm: usb-storage Tainted: G      D    2.6.33-rc3-git4 #2
Jan 13 00:39:40 navi kernel: [  326.385188] Call Trace:
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1527e54>] ? printk+0x14/0x16
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1527d9f>] panic+0x48/0xe9
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1005026>] oops_end+0x77/0x85
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1019d9f>] no_context+0x114/0x11e
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1019e9b>] __bad_area_nosemaphore+0xf2/0xfa
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1019eb5>] bad_area_nosemaphore+0x12/0x15
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c101a0ed>] do_page_fault+0x128/0x295
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1019fc5>] ? do_page_fault+0x0/0x295
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152b1d7>] error_code+0x63/0x68
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1019fc5>] ? do_page_fault+0x0/0x295
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c136b9d2>] ? tg3_poll_work+0x4f3/0x906
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c136be77>] tg3_poll+0x92/0x18b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c141dbbd>] net_rx_action+0x7d/0x1de
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c102e2ba>] ? __do_softirq+0x0/0x18c
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c102e377>] __do_softirq+0xbd/0x18c
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c102e2ba>] ? __do_softirq+0x0/0x18c
Jan 13 00:39:40 navi kernel: [  326.385188]  <IRQ>  [<c102df8f>] ? irq_exit+0x3b/0x77
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1003dbd>] ? do_IRQ+0x7d/0x90
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1002cee>] ? common_interrupt+0x2e/0x34
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c104007b>] ? kfifo_from_user+0x3f/0x6c
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9c6>] ? _raw_spin_unlock_irq+0x2d/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c102299c>] ? T.1280+0x5d/0x94
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c102293f>] ? T.1280+0x0/0x94
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152857a>] ? schedule+0x410/0x46c
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1528770>] ? schedule_timeout+0x1c/0x1d9
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152800e>] ? wait_for_common+0x31/0x100
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c104ceb2>] ? trace_hardirqs_on+0xb/0xd
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9cb>] ? _raw_spin_unlock_irq+0x32/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c10213bb>] ? sub_preempt_count+0x8b/0x98
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1528095>] ? wait_for_common+0xb8/0x100
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1024d98>] ? default_wake_function+0x0/0x12
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1528115>] ? wait_for_completion_interruptible_timeout+0x12/0x14
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13a8fe3>] ? usb_stor_msg_common+0xee/0x112
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13a95b2>] ? usb_stor_bulk_transfer_buf+0x46/0x73
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13a96ad>] ? usb_stor_Bulk_transport+0xce/0x24e
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13a916e>] ? usb_stor_invoke_transport+0x1c/0x298
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa78c>] ? usb_stor_control_thread+0x55/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa78c>] ? usb_stor_control_thread+0x55/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c104ceb2>] ? trace_hardirqs_on+0xb/0xd
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9c0>] ? _raw_spin_unlock_irq+0x27/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152a9cb>] ? _raw_spin_unlock_irq+0x32/0x47
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13a8d2f>] ? usb_stor_transparent_scsi_command+0xd/0xf
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa852>] ? usb_stor_control_thread+0x11b/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c152871c>] ? preempt_schedule+0x35/0x44
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c10209d6>] ? complete+0x39/0x43
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c13aa737>] ? usb_stor_control_thread+0x0/0x19b
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c103f838>] ? kthread+0x63/0x68
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c103f7d5>] ? kthread+0x0/0x68
Jan 13 00:39:40 navi kernel: [  326.385188]  [<c1002cfa>] ? kernel_thread_helper+0x6/0x10
Jan 13 00:39:40 navi kernel: [  326.385188] [drm:drm_fb_helper_panic] *ERROR* panic occurred, switching back to text console

objdump: tg3_poll_work:
->    b2ca:       8b b6 a4 00 00 00       mov    0xa4(%esi),%esi
^ This is the dereference of skb (skb->data) in tg3_poll_work, line 4657,
from the inlined "skb_copy_from_linear_data(skb, copy_skb->data, len);"
    b2d0:       8b 4d e0                mov    -0x20(%ebp),%ecx
    b2d3:       8b 82 a4 00 00 00       mov    0xa4(%edx),%eax
    b2d9:       89 c7                   mov    %eax,%edi
    b2db:       f3 a4                   rep movsb %ds:(%esi),%es:(%edi)
    b2dd:       8b 53 58                mov    0x58(%ebx),%edx
    b2e0:       8b 75 d8                mov    -0x28(%ebp),%esi
    b2e3:       85 d2                   test   %edx,%edx
    b2e5:       8d 42 60                lea    0x60(%edx),%eax
    b2e8:       8b 15 00 00 00 00       mov    0x0,%edx
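
To cross-check the oops against the objdump output, here is a rough
Python sketch (my own helper names, not part of any kernel tooling).
It pulls the bytes at the faulting EIP out of the "Code:" line, decodes
the ModRM byte of the mov, and adds the displacement to the ESI value
from the register dump. It only handles the one encoding seen here
(opcode 8b, register base plus 32-bit displacement, no SIB byte).

```python
# The "Code:" line from the oops; <..> marks the byte at the faulting EIP.
OOPS_CODE = (
    "7a b2 0a 00 8b 53 58 31 c0 8d 4a 60 85 d2 8b 15 b8 38 76 c1 "
    "0f 45 c1 8b 7a 18 85 ff 74 0a 6a 02 8b 4d e0 31 d2 ff d7 58 "
    "8b 55 d8 <8b> b6 a4 00 00 00 8b 4d e0 8b 82 a4 00 00 00 89 c7 "
    "f3 a4 8b 53"
)

# 32-bit general purpose registers in x86 encoding order.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line, count):
    """Return `count` bytes starting at the <..>-marked byte."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:start + count])


def decode_mov_disp32(insn):
    """Decode 'mov disp32(base), dest' (opcode 8b, mod=10, no SIB byte)."""
    if insn[0] != 0x8B:
        raise ValueError("not an 8b mov")
    modrm = insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("encoding not handled by this sketch")
    disp = int.from_bytes(insn[2:6], "little", signed=True)
    return REGS32[reg], REGS32[rm], disp


def fault_address(base_value, disp):
    """Effective address of the load, truncated to 32 bits."""
    return (base_value + disp) & 0xFFFFFFFF


insn = faulting_bytes(OOPS_CODE, 6)
dest, base, disp = decode_mov_disp32(insn)
esi = 0x00000000  # ESI from the register dump in the oops
print(f"mov {disp:#x}(%{base}),%{dest} -> fault at {fault_address(esi, disp):#010x}")
```

The six bytes are the same ones objdump shows at b2ca, and with ESI
being zero the effective address is just the displacement, which is
the value reported in CR2.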


git bisect start
# bad: [55639353a0035052d9ea6cfe4dde0ac7fcbb2c9f] Linux 2.6.33-rc1
git bisect bad 55639353a0035052d9ea6cfe4dde0ac7fcbb2c9f
# good: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect good 22763c5cf3690a681551162c15d34d935308c8d7
# bad: [701791cc3c8fc6dd83f6ec8af7e2541b4a316606] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
git bisect bad 701791cc3c8fc6dd83f6ec8af7e2541b4a316606
# bad: [28b4d5cc17c20786848cdc07b7ea237a309776bb] Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
git bisect bad 28b4d5cc17c20786848cdc07b7ea237a309776bb
# good: [0ab365f463b9c5c8b76476a1808dfde1c38f6f19] bnx2x: version 1.52.1-4
git bisect good 0ab365f463b9c5c8b76476a1808dfde1c38f6f19
# bad: [0ccfe64d3f177a61a071b7a6fa363f0a292158c4] mv643xx: convert to netdev_tx_t
git bisect bad 0ccfe64d3f177a61a071b7a6fa363f0a292158c4
# bad: [0ccfe64d3f177a61a071b7a6fa363f0a292158c4] mv643xx: convert to netdev_tx_t
git bisect bad 0ccfe64d3f177a61a071b7a6fa363f0a292158c4
# bad: [dfef948ed2ba69cf041840b5e860d6b4e16fa0b1] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
git bisect bad dfef948ed2ba69cf041840b5e860d6b4e16fa0b1
# good: [634a555ce3ee5ea1fdcaee8b4ac9ce7b54f301ac] rndis_wlan: handle NL80211_AUTHTYPE_AUTOMATIC
git bisect good 634a555ce3ee5ea1fdcaee8b4ac9ce7b54f301ac
# good: [411da6407e778bf946911df08bb5afc505422f31] tg3: rename rx_[std|jmb]_ptr
git bisect good 411da6407e778bf946911df08bb5afc505422f31
# bad: [b76965e02bfdd4164c00bf946ff6ca1818ed9fcd] act_mirred: optimization.
git bisect bad b76965e02bfdd4164c00bf946ff6ca1818ed9fcd
# bad: [a2bfbc072e279ff81e6b336acff612b9bc2e5281] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
git bisect bad a2bfbc072e279ff81e6b336acff612b9bc2e5281
# bad: [2d6682db114cb53bc94991659478756302e6a600] bonding: fix 802.3ad standards compliance error
git bisect bad 2d6682db114cb53bc94991659478756302e6a600
# bad: [b196c7e45f30cbcd38c83386bc8a04a21477f8d3] tg3: Add rx prod ring consolidation
git bisect bad b196c7e45f30cbcd38c83386bc8a04a21477f8d3
# bad: [2b2cdb65bec42d38268b2ac115876b066afa7f95] tg3: Lay proucer ring handling groundwork
git bisect bad 2b2cdb65bec42d38268b2ac115876b066afa7f95
# bad: [4361935afe3abc3e5a93006b99197fac1fabbd50] tg3: Consider rx_std_prod_idx a hw mailbox
git bisect bad 4361935afe3abc3e5a93006b99197fac1fabbd50
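
For anyone who wants to summarise a log like this, a small Python
sketch follows (helper names are mine). It reads the "# good:" and
"# bad:" comment lines and reports the most recent verdict of each
kind. The sample text is only a subset of the entries above. Note that
the last commit marked bad is the first bad commit only if the bisect
had converged, which this log alone does not show.

```python
import re

# A subset of the entries from the bisect log above.
LOG_TAIL = """\
# good: [411da6407e778bf946911df08bb5afc505422f31] tg3: rename rx_[std|jmb]_ptr
git bisect good 411da6407e778bf946911df08bb5afc505422f31
# bad: [b196c7e45f30cbcd38c83386bc8a04a21477f8d3] tg3: Add rx prod ring consolidation
git bisect bad b196c7e45f30cbcd38c83386bc8a04a21477f8d3
# bad: [2b2cdb65bec42d38268b2ac115876b066afa7f95] tg3: Lay proucer ring handling groundwork
git bisect bad 2b2cdb65bec42d38268b2ac115876b066afa7f95
# bad: [4361935afe3abc3e5a93006b99197fac1fabbd50] tg3: Consider rx_std_prod_idx a hw mailbox
git bisect bad 4361935afe3abc3e5a93006b99197fac1fabbd50
"""

# Matches the comment lines git writes into the bisect log.
ENTRY = re.compile(r"^# (good|bad): \[([0-9a-f]{40})\] (.*)$")


def parse_verdicts(log_text):
    """Return (verdict, short_sha, subject) tuples in log order."""
    verdicts = []
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            verdict, sha, subject = match.groups()
            verdicts.append((verdict, sha[:12], subject))
    return verdicts


def last_with(verdicts, wanted):
    """Return (short_sha, subject) of the most recent entry with this verdict."""
    for verdict, sha, subject in reversed(verdicts):
        if verdict == wanted:
            return sha, subject
    return None


verdicts = parse_verdicts(LOG_TAIL)
print("last good:", last_with(verdicts, "good"))
print("last bad: ", last_with(verdicts, "bad"))
```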

-- 
Tobias						PGP: http://8ef7ddba.uguu.de