On Mon, 23 Jan 2012, Sarah Sharp wrote:
> On Fri, Jan 20, 2012 at 03:16:31PM -0800, Paul Zimmerman wrote:
> > This happened while I was testing a device with the g_zero gadget using the
> > testusb program on the host. It happened when I tried to run test 10. Without
> > the ring expansion patches, test 10 always gave an allocation failure when I
> > tried to run it.
>
> Ok, an easy way to see if it is the dynamic ring expansion patches that
> are causing your oops is to revert the patches, then change the number
> of ring segments allocated in xhci-mem.c:
>
> 	/*
> 	 * Isochronous endpoint ring needs bigger size because one isoc URB
> 	 * carries multiple packets and it will insert multiple tds to the
> 	 * ring.
> 	 * This should be replaced with dynamic ring resizing in the future.
> 	 */
> 	if (usb_endpoint_xfer_isoc(&ep->desc))
> 		virt_dev->eps[ep_index].new_ring =
> 			xhci_ring_alloc(xhci, 8, true, true, mem_flags);
> 	else
> 		virt_dev->eps[ep_index].new_ring =
> 			xhci_ring_alloc(xhci, 1, true, false, mem_flags);
>
> Change the second parameter to xhci_ring_alloc() to something large
> enough that you don't get allocation failures. Say 64 segments?
>
> > I don't know how reproducible this is. I rebooted and ran test 10 again, and
> > that time there was no panic. There was a different problem, the test hung, so
> > I am looking at that now.
>
> Hmm, is there any way you can run netconsole with xHCI debugging turned
> on and capture the oops or hang?

Hi Sarah, sorry it took so long to reply, I was busy with other stuff.

I set up a serial console and captured the log below. It looks like the
actual problem is list pointer corruption. It seems to happen after the
xhci driver expands the ring for EP0. (The dmesg doesn't say which EP it
is, but the test case is doing transfers on the control EP.)

I then reverted the ring expansion patches, increased the EP0 ring to 2
segments, and the other EPs to 8 segments. All of the test cases I tried
worked fine with that.
So it seems that the ring expansion patches are the likely suspect. Next
I will turn on xhci debugging and capture the failure again.

-- 
Paul

[ 45.014423] usbcore: registered new interface driver usbtest
[ 135.409000] hub 10-0:1.0: state 7 ports 1 chg 0000 evt 0002
[ 135.415033] hub 10-0:1.0: port 1, status 0203, change 0001, 5.0 Gb/s
[ 135.525025] hub 10-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x203
[ 135.635220] usb 10-1: new SuperSpeed USB device number 2 using xhci_hcd
[ 135.653591] usb 10-1: skipped 1 descriptor after endpoint
[ 135.659431] usb 10-1: skipped 1 descriptor after endpoint
[ 135.665456] usb 10-1: skipped 1 descriptor after endpoint
[ 135.671291] usb 10-1: skipped 1 descriptor after endpoint
[ 135.677237] usb 10-1: default language 0x0409
[ 135.682249] usb 10-1: udev 2, busnum 10, minor = 1153
[ 135.687705] usb 10-1: New USB device found, idVendor=0525, idProduct=a4a0
[ 135.695065] usb 10-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 135.702883] usb 10-1: Product: Gadget Zero
[ 135.707328] usb 10-1: Manufacturer: Linux 3.2.0+ with dwc_usb3_pcd
[ 135.714012] usb 10-1: SerialNumber: 0123456789.0123456789.0123456789
[ 135.720936] usb 10-1: usb_probe_device
[ 135.724999] usb 10-1: configuration #3 chosen from 2 choices
[ 135.731178] usb 10-1: Successful Endpoint Configure command
[ 135.737461] usb 10-1: adding 10-1:3.0 (config #3, interface 0)
[ 135.743925] usbtest 10-1:3.0: usb_probe_interface
[ 135.749031] usbtest 10-1:3.0: usb_probe_interface - got id
[ 135.755008] usbtest 10-1:3.0: Linux gadget zero
[ 135.759916] usbtest 10-1:3.0: super-speed {control in/out bulk-in bulk-out} tests (+alt)
[ 135.768702] /git/xhci/drivers/usb/core/inode.c: creating file '002'
[ 160.837457] usb 10-1: Successful Endpoint Configure command
[ 160.844755] usbtest 10-1:3.0: TEST 20: read odd addr 1024 bytes 1000 times premapped
[ 167.042699] usb 10-1: Successful Endpoint Configure command
[ 167.049912] usbtest 10-1:3.0: TEST 9: ch9 (subset) control tests, 1000 times
[ 173.426732] usb 10-1: Successful Endpoint Configure command
[ 173.433935] usbtest 10-1:3.0: TEST 10: queue 32 control calls, 1000 times
[ 173.441407] Waiting for dequeue pointer to pass the link TRB
[ 173.447941] ring expansion succeed, now has 2 segments
[ 173.448688] usbtest 10-1:3.0: subcase 6 completed out of order, last 9
[ 173.454104] usbtest 10-1:3.0: control queue 80.06, err -33, 31989 left, subcase 6, len 0/18
[ 173.454104] ------------[ cut here ]------------
[ 173.454104] WARNING: at /git/xhci/lib/list_debug.c:30 __list_add+0x68/0x80()
[ 173.454104] Hardware name:
[ 173.454104] list_add corruption. prev->next should be next (ffff88013503bcc0), but was ffff88013384d980. (prev=ffff88013384dfc0).
[ 173.454104] Modules linked in: usbtest xhci_hcd fuse sunrpc ipv6 uinput snd_hda_codec_idt i2c_i801 iTCO_wdt snd_hda_intel iTCO_vendor_support serio_raw snd_hda_codec x38_edac e1000e snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd edac_core pcspkr soundcore snd_page_alloc microcode firewire_ohci firewire_core crc_itu_t nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video [last unloaded: speedstep_lib]
[ 173.454104] Pid: 1218, comm: Xorg Not tainted 3.2.0-rc5-sarah+ #4
[ 173.454104] Call Trace:
[ 173.454104] <IRQ> [<ffffffff81044f02>] warn_slowpath_common+0x85/0x9d
[ 173.454104] [<ffffffff81044fbd>] warn_slowpath_fmt+0x46/0x48
[ 173.454104] [<ffffffff8123b993>] ? vsnprintf+0x83/0x44c
[ 173.454104] [<ffffffff81241dd0>] __list_add+0x68/0x80
[ 173.454104] [<ffffffffa02a22b9>] prepare_transfer+0x11f/0x14e [xhci_hcd]
[ 173.454104] [<ffffffffa02a3181>] xhci_queue_ctrl_tx+0x84/0x233 [xhci_hcd]
[ 173.454104] [<ffffffff8110d367>] ? __kmalloc+0xe9/0xfc
[ 173.454104] [<ffffffffa029d4ab>] xhci_urb_enqueue+0x27a/0x3c3 [xhci_hcd]
[ 173.454104] [<ffffffff8133df20>] usb_hcd_submit_urb+0x628/0x6e0
[ 173.454104] [<ffffffff812451ee>] ? is_swiotlb_buffer+0x2e/0x3b
[ 173.454104] [<ffffffff812454fc>] ? unmap_single+0x27/0x50
[ 173.454104] [<ffffffff8133ef0a>] usb_submit_urb+0x39e/0x3b0
[ 173.454104] [<ffffffffa0145a1a>] ctrl_complete+0x1c8/0x223 [usbtest]
[ 173.454104] [<ffffffff8133cbb3>] ? usb_hcd_unmap_urb_for_dma+0x21/0x130
[ 173.454104] [<ffffffff8133cd6e>] usb_hcd_giveback_urb+0x88/0xc0
[ 173.454104] [<ffffffffa02a3b92>] inc_deq+0x230/0x2b9 [xhci_hcd]
[ 173.454104] [<ffffffffa02a4610>] finish_td+0xf7/0x1d2 [xhci_hcd]
[ 173.454104] [<ffffffffa02a5089>] xhci_irq+0x99e/0xf38 [xhci_hcd]
[ 173.454104] [<ffffffff81066f65>] ? groups_free+0x44/0x49
[ 173.454104] [<ffffffffa02a5634>] xhci_msi_irq+0x11/0x15 [xhci_hcd]
[ 173.454104] [<ffffffff8109fad6>] handle_irq_event_percpu+0x5f/0x197
[ 173.454104] [<ffffffff8109fc49>] handle_irq_event+0x3b/0x5a
[ 173.454104] [<ffffffff8101dc0c>] ? ack_apic_edge+0x27/0x2b
[ 173.454104] [<ffffffff810a2531>] handle_edge_irq+0xa9/0xd0
[ 173.454104] [<ffffffff81003bf6>] handle_irq+0x91/0x99
[ 173.454104] [<ffffffff8148244d>] do_IRQ+0x4d/0xa5
[ 173.454104] [<ffffffff8147972e>] common_interrupt+0x6e/0x6e
[ 173.454104] <EOI> [<ffffffff81241443>] ? ioread32+0xf/0x31
[ 173.454104] [<ffffffff8147937e>] ? _raw_spin_lock+0xe/0x10
[ 173.454104] [<ffffffffa0093853>] nouveau_fence_update+0x4a/0xd2 [nouveau]
[ 173.454104] [<ffffffffa0093b6b>] __nouveau_fence_signalled+0x21/0x29 [nouveau]
[ 173.454104] [<ffffffffa0093bc5>] __nouveau_fence_wait+0x52/0xe0 [nouveau]
[ 173.454104] [<ffffffffa0075760>] ttm_bo_wait+0xa6/0x160 [ttm]
[ 173.454104] [<ffffffffa009360f>] nouveau_bo_vma_del+0x40/0x69 [nouveau]
[ 173.454104] [<ffffffffa0094b84>] nouveau_gem_object_close+0x71/0x86 [nouveau]
[ 173.454104] [<ffffffffa002af9c>] drm_gem_handle_delete+0x75/0x8b [drm]
[ 173.454104] [<ffffffffa002b406>] drm_gem_close_ioctl+0x28/0x2a [drm]
[ 173.454104] [<ffffffffa002980e>] drm_ioctl+0x2d6/0x3b3 [drm]
[ 173.454104] [<ffffffffa002b3de>] ? drm_gem_destroy+0x43/0x43 [drm]
[ 173.454104] [<ffffffff81128535>] do_vfs_ioctl+0x474/0x4b5
[ 173.454104] [<ffffffff8111985a>] ? fsnotify_access+0x62/0x6a
[ 173.454104] [<ffffffff811285cc>] sys_ioctl+0x56/0x7a
[ 173.454104] [<ffffffff8147a116>] ? do_device_not_available+0xe/0x10
[ 173.454104] [<ffffffff8147fac2>] system_call_fastpath+0x16/0x1b
[ 173.454104] ---[ end trace 7a3709086a13e5f2 ]---
[ 173.454104] usbtest 10-1:3.0: subtest 9 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 10 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 13 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 14 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 0 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 1 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 0 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 1 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 3 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 4 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 6 error, status -32
[ 173.454104] usbtest 10-1:3.0: subtest 9 error, status -32
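
For anyone reading the warning in the log above: the "list_add corruption. prev->next should be next" message comes from a sanity check that runs before a new entry is linked into a doubly linked list. The following is only a userspace sketch of that kind of check, not the kernel's code from lib/list_debug.c. The names checked_list_add(), checked_list_add_tail(), stale_next_is_detected(), td_list and stray are made up for illustration, and the scenario (an entry whose next pointer gets overwritten after it was queued) is an assumption about how such a warning can arise, not a diagnosis of this bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Link 'entry' between 'prev' and 'next', but only after verifying that
 * the two neighbours still point at each other. Returns false (and links
 * nothing) when they do not.
 */
static bool checked_list_add(struct list_head *entry,
			     struct list_head *prev,
			     struct list_head *next)
{
	if (next->prev != prev) {
		fprintf(stderr,
			"list_add corruption. next->prev should be prev (%p), but was %p.\n",
			(void *)prev, (void *)next->prev);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"list_add corruption. prev->next should be next (%p), but was %p.\n",
			(void *)next, (void *)prev->next);
		return false;
	}
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return true;
}

/* Tail insert: the new entry goes between the current last entry and the head. */
static bool checked_list_add_tail(struct list_head *entry,
				  struct list_head *head)
{
	return checked_list_add(entry, head->prev, head);
}

/*
 * Hypothetical scenario: one entry is queued, its 'next' pointer is then
 * overwritten, and a second tail insert is attempted. Returns true if the
 * second insert was rejected by the check.
 */
static bool stale_next_is_detected(void)
{
	struct list_head td_list, first, second, stray;
	bool queued;

	init_list_head(&td_list);
	init_list_head(&stray);

	queued = checked_list_add_tail(&first, &td_list);
	assert(queued);

	first.next = &stray;	/* simulated corruption of the last entry */

	return !checked_list_add_tail(&second, &td_list);
}
```

Calling stale_next_is_detected() from a small main() prints a message of the same shape as the one in the log, because the last entry's next pointer no longer leads back to the list head when the tail insert is attempted.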