Search Linux Wireless

Re: wireless-testing, p54 and sinus 154 data no longer works

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christian Lamparter wrote:
> On Monday 19 January 2009 19:15:09 Artur Skawina wrote:
>> Christian Lamparter wrote:
>>> On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
>>>> Artur Skawina wrote:
>>>>> didn't trigger anything here, just the usual:
>>>>>
>>>>> BUG kmalloc-4096: Poison overwritten
>>>> This is almost 100% reproducible; sometimes the machine freezes instead.
>> Here's an interesting sequence:
>>
>> 1) a TX urb is submitted.
>> 2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
>> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.
>>
>> IOW the skb is freed before the usb completion runs.
> 
> Well, the sequence should be:
> 
> 1) p54_tx gets called
> 1.1) one IRQ urb is submitted
> 1.2) one BULK urb is submitted
> 2) the firmware acks that it got the urbs
> 2.1) p54u_tx_cb is called for the IRQ urb. which frees the small buffer
> 2.2) p54u_tx_cb is called for the BULK urb. which only removes the net2280_tx_hdr from the skb.
> [time passes]
> 3) firmware is finished sending.
> 3.1) p54u_rx_cb gets called
>        => p54_rx_frame_sent passed the feedback to mac80211

That's what one would expect, and is probably why i couldn't see anything
wrong in the code despite going over it several times. Until i got a crash
which left no doubt as to what happened, and made me notice the "wrong"
completion order, log attached [1].
In theory, theory and practice do not differ, in practice...

>> Somehow i don't think this is the reason for the corruption, but it certainly
>> seems to be responsible for some, if not all, of the crashes/panics.
> dunno... we should see a bit more fallout, because skb_pull changes skb->data and skb->len. 

Doing an skb_pull in p54u_tx_cb on skbs that have already been given to mac80211
cannot be good.
We can move the FREE_AFTER_TX(skb) check from the completion to the submission
path, right? Then find a way to do the pull _before_ giving away the skbs.
I can't shutdown the machine where i can reproduce this today, so it will have
to wait until at least tomorrow.

artur

[1]: note there are two issues, first the skb_pull triggers a BUG, which
 is then interrupted by the SLUB error, then the rest of the first BUG
 follows. i've added '#'-comments explaining some of the numbers.

[...]
TX: LEN=   20 SKB=d7eb2b60 HEAD=d78f28c0
TX: DONE 
TX: LEN=   32 SKB=da7c0620 HEAD=de02b3b0
TX: DONE 
TX: LEN=  436 SKB=d7eb2ee0 HEAD=dd4ee180
TXCB: LEN=   36 SKB=d7eb2b60 HEAD=d78f28c0
TXCB: LEN=   48 SKB=da7c0620 HEAD=de02b3b0
TX: DONE 
TX: LEN=  436 SKB=dd55e2a0 HEAD=dd4ef670
TX: DONE 
TX: LEN=  176 SKB=de01e8c0 HEAD=d9812060
TX: DONE 
TXCB: LEN=  452 SKB=d7eb2ee0 HEAD=dd4ee180
TXCB: LEN=  452 SKB=dd55e2a0 HEAD=dd4ef670
TXCB: LEN=  192 SKB=de01e8c0 HEAD=d9812060
RXCB: LEN=   36 SKB=dd5dfee0 HEAD=d9813090
RXCB: LEN=   36 SKB=dd433700 HEAD=d9811030
RXCB: LEN=   36 SKB=dfbf3a80 HEAD=d9816120
TX: LEN=  160 SKB=dfb2da80 HEAD=d9889030        # (1) urb submission, SKB=dfb2da80
TX: DONE 
TX: LEN=  168 SKB=ce1388c0 HEAD=d9888000
RXCB: LEN=   36 SKB=ce138d20 HEAD=d98140c0      # (2) this eventually calls p54_rx_frame_sent() and skb_kfree on SKB=dfb2da80
TXCB: LEN=1802201963 SKB=dfb2da80 HEAD=6b6b6b6b # (3) p54u_tx_cb() was called with the now freed and poisoned SKB=dfb2da80
------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1073!      # BUG() in skb_pull(), see stacktrace below, after the SLUB error
invalid opcode: 0000 [#1] DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:07.2/usb1/1-1/1-1.1/uevent
Modules linked in: p54usb p54common<3>=============================================================================
BUG skbuff_head_cache: Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xdfb2dad0-0xdfb2dad0. First byte 0x5b instead of 0x6b            # 0x10 is the amount that was pulled (net2280 rx header)
INFO: Allocated in __alloc_skb+0x34/0x110 age=28 cpu=0 pid=0
INFO: Freed in p54_rx+0x760/0x770 [p54common] age=0 cpu=0 pid=0         # p54_rx+0x760/0x770 is the skb_free in p54_rx_frame_sent() (gcc decided to inline it)
INFO: Slab 0xc13f65a0 objects=18 used=16 fp=0xdfb2da80 flags=0x400000c2
INFO: Object 0xdfb2da80 @offset=2688 fp=0xdfb2dee0                      # the freed skb

Bytes b4 0xdfb2da70:  00 00 00 00 0f 23 00 00 ba e7 1b 02 5a 5a 5a 5a .....#..ºç..ZZZZ
  Object 0xdfb2da80:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2da90:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2daa0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2dab0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2dac0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2dad0:  5b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [kkkkkkkkkkkkkkk
  Object 0xdfb2dae0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2daf0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2db00:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2db10:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2db20:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xdfb2db30:  6b 6b 6b a5                                     kkk¥            
 Redzone 0xdfb2db34:  bb bb bb bb                                     »»»»            
 Padding 0xdfb2db5c:  5a 5a 5a 5a                                     ZZZZ            
Pid: 0, comm: swapper Tainted: G        W  2.6.29-rc1-00222-g27956bf-dirty #69
Call Trace:
 [<c016cf1e>] check_bytes_and_report+0xce/0xf0
 [<c016d177>] check_object+0x1e7/0x230
 [<c016e915>] __slab_alloc+0x245/0x4f0
 [<c03afe04>] __alloc_skb+0x34/0x110
 [<c016f0eb>] kmem_cache_alloc+0xbb/0xc0
 [<c03afe04>] __alloc_skb+0x34/0x110
 [<c03afe04>] __alloc_skb+0x34/0x110
 [<c03c5695>] find_skb+0x35/0x90
 [<c03c58ae>] netpoll_send_udp+0x2e/0x200
 [<e31371ad>] write_msg+0x9d/0xe0 [netconsole]
 [<e3137110>] write_msg+0x0/0xe0 [netconsole]
 [<c0122533>] __call_console_drivers+0x43/0x50
 [<c01228f7>] release_console_sem+0x87/0x1c0
 [<c0122ec7>] vprintk+0x227/0x2d0
 [<e3137110>] write_msg+0x0/0xe0 [netconsole]
 [<c04dcb00>] printk+0x17/0x1f
 [<c0142ea4>] print_modules+0x74/0xb0
 [<c01051ba>] show_registers+0x1a/0x200
 [<c0138647>] atomic_notifier_call_chain+0x17/0x20
 [<c0138740>] notify_die+0x30/0x40
 [<c0105df6>] __die+0xa6/0x100
 [<c0106070>] die+0x40/0x80
 [<c0103f60>] do_invalid_op+0x0/0xa0
 [<c0103fe1>] do_invalid_op+0x81/0xa0
 [<c03aced8>] skb_pull+0x18/0x40
 [<e3137110>] write_msg+0x0/0xe0 [netconsole]
 [<c0122533>] __call_console_drivers+0x43/0x50
 [<c013ad72>] update_wall_time+0x492/0x8e0
 [<c04de722>] error_code+0x6a/0x70
 [<c016007b>] sys_mincore+0x26b/0x350
 [<c03aced8>] skb_pull+0x18/0x40
 [<e1f1d815>] p54u_tx_cb+0x95/0x120 [p54usb]
 [<c0341489>] usb_hcd_giveback_urb+0x49/0xe0
 [<c034fe4f>] uhci_giveback_urb+0x5f/0x1f0
 [<c034fe6f>] uhci_giveback_urb+0x7f/0x1f0
 [<c035043f>] uhci_scan_schedule+0x33f/0xa40
 [<c035278d>] uhci_irq+0x7d/0x140
 [<c0341125>] usb_hcd_irq+0x25/0x60
 [<c0146de8>] handle_IRQ_event+0x28/0x50
 [<c0148000>] handle_level_irq+0x0/0xb0
 [<c014804b>] handle_level_irq+0x4b/0xb0
 <IRQ>  [<c0103707>] common_interrupt+0x27/0x2c
 [<c0122dbc>] vprintk+0x11c/0x2d0
 [<c034301f>] usb_alloc_urb+0xf/0x30
 [<c034301f>] usb_alloc_urb+0xf/0x30
 [<c04dcb00>] printk+0x17/0x1f
 [<e1f1d212>] p54u_tx_net2280+0x52/0x320 [p54usb]
 [<e1d4845d>] p54_tx+0x51d/0x820 [p54common]
 [<c04ca789>] __ieee80211_tx+0x49/0x130
 [<c04cb61d>] ieee80211_master_start_xmit+0x28d/0x610
 [<c0137401>] __run_hrtimer+0x31/0xf0
 [<c013e894>] tick_program_event+0x14/0x20
 [<c013b311>] getnstimeofday+0x51/0x120
 [<c03b702b>] dev_hard_start_xmit+0x1db/0x240
 [<c03c883b>] __qdisc_run+0x1ab/0x200
 [<c03b4dd1>] net_tx_action+0x81/0xb0
 [<c0126f07>] __do_softirq+0x57/0xf0
 [<c0126eb0>] __do_softirq+0x0/0xf0
 <IRQ>  [<c0137c38>] ktime_get+0x18/0x40
 [<c0148000>] handle_level_irq+0x0/0xb0
 [<c0126c65>] irq_exit+0x55/0x70
 [<c0104f1b>] do_IRQ+0x9b/0x120
 [<c0103707>] common_interrupt+0x27/0x2c
 [<c0108392>] default_idle+0x42/0x50
 [<c0102139>] cpu_idle+0x39/0x80
FIX skbuff_head_cache: Restoring 0xdfb2dad0-0xdfb2dad0=0x6b

FIX skbuff_head_cache: Marking all objects used
 netconsole budget saa7134_empress saa6752hs saa7134 serio_raw budget_ci videobuf_dma_sg budget_core saa7146 ttpci_eeprom videobuf_core tveeprom ir_common [last unloaded: p54common]

Pid: 0, comm: swapper Tainted: G        W  (2.6.29-rc1-00222-g27956bf-dirty #69)    
EIP: 0060:[<c03aced8>] EFLAGS: 00010087 CPU: 0
EIP is at skb_pull+0x18/0x40
EAX: 6b6b6b5b EBX: 6b6b6b6b ECX: dfb2da80 EDX: 00000010
ESI: dfb2da80 EDI: d744bdd0 EBP: d744b1f0 ESP: c0690ef0
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c0690000 task=c05fc300 task.ti=c0646000)
Stack:
 d980a370 e1f1d815 e1f1f440 6b6b6b6b dfb2da80 6b6b6b6b d744bf74 00000086
 d980a370 00000000 df9fb180 d980a370 c0341489 c034fe4f 00000046 df965de0
 df9fb180 df994900 dfa46a68 c034fe6f df9948bc df99491c 00a464d8 df9fb24c
Call Trace:
 [<e1f1d815>] p54u_tx_cb+0x95/0x120 [p54usb]                               # this is the skb_pull in p54u_tx_cb
 [<c0341489>] usb_hcd_giveback_urb+0x49/0xe0
 [<c034fe4f>] uhci_giveback_urb+0x5f/0x1f0
 [<c034fe6f>] uhci_giveback_urb+0x7f/0x1f0
 [<c035043f>] uhci_scan_schedule+0x33f/0xa40
 [<c035278d>] uhci_irq+0x7d/0x140
 [<c0341125>] usb_hcd_irq+0x25/0x60
 [<c0146de8>] handle_IRQ_event+0x28/0x50
 [<c0148000>] handle_level_irq+0x0/0xb0
 [<c014804b>] handle_level_irq+0x4b/0xb0
 <IRQ> <0> [<c0103707>] common_interrupt+0x27/0x2c
 [<c0122dbc>] vprintk+0x11c/0x2d0
 [<c034301f>] usb_alloc_urb+0xf/0x30
 [<c034301f>] usb_alloc_urb+0xf/0x30
 [<c04dcb00>] printk+0x17/0x1f
 [<e1f1d212>] p54u_tx_net2280+0x52/0x320 [p54usb]
 [<e1d4845d>] p54_tx+0x51d/0x820 [p54common]
 [<c04ca789>] __ieee80211_tx+0x49/0x130
 [<c04cb61d>] ieee80211_master_start_xmit+0x28d/0x610
 [<c0137401>] __run_hrtimer+0x31/0xf0
 [<c013e894>] tick_program_event+0x14/0x20
 [<c013b311>] getnstimeofday+0x51/0x120
 [<c03b702b>] dev_hard_start_xmit+0x1db/0x240
 [<c03c883b>] __qdisc_run+0x1ab/0x200
 [<c03b4dd1>] net_tx_action+0x81/0xb0
 [<c0126f07>] __do_softirq+0x57/0xf0
 [<c0126eb0>] __do_softirq+0x0/0xf0
 <IRQ> <0> [<c0137c38>] ktime_get+0x18/0x40
 [<c0148000>] handle_level_irq+0x0/0xb0
 [<c0126c65>] irq_exit+0x55/0x70
 [<c0104f1b>] do_IRQ+0x9b/0x120
 [<c0103707>] common_interrupt+0x27/0x2c
 [<c0108392>] default_idle+0x42/0x50
 [<c0102139>] cpu_idle+0x39/0x80
Code: 83 c4 0c c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 53 89 c1 8b 58 50 31 c0 39 d3 72 24 89 d8 29 d0 3b 41 54 89 41 50 73 08 <0f> 0b eb fe 8d 74 26 00 8b 99 a8 00 00 00 89 d0 01 d8 89 81 a8 
EIP: [<c03aced8>] skb_pull+0x18/0x40 SS:ESP 0068:c0690ef0
---[ end trace bba3fc5c928ed0e0 ]---
Kernel panic - not syncing: Fatal exception in interrupt



--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]
  Powered by Linux