Re: Continuous stream of small bulk transfers hangs on OHCI-based systems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Reports like this should always be sent to the mailing list as well as 
to me.  And please instruct your mail client to wrap lines after 72 
columns or so -- long lines are hard to deal with.

On Fri, 1 Feb 2013, Mark Ferrell wrote:

> Hello Alan,
> 
> I have been attempting to debug a hang with the OHCI layer when I ran across your August conversation with Tomas Sokorai regarding similar behaviour, though the behaviour did not exist for us in 2.6.27 on the same hardware.

What version of the kernel are you using?

> The hardware in question is an AMD Geode LX900 with 3 FTDI Dual-Port USB Serial interfaces attached to the OHCI bus.  We have an application which manages the serial ports and at some point the ports stop responding.  If we attempt to close() the ports the application will hang and attempting to kill -9 the application triggers a whole slew of issues within the kernel.  For example,  since the USB device in question is a tty device the 'ps' goes into D state while attempting to display the Applications process information.  I presume their is secondary contention in tty_lock() at this point.

The problem probably is connected with the OHCI controller and not with 
the FTDI devices.  Once the problem occurs, it is too late to do any 
debugging.

> Like Tomas we noticed that the OHCI interface "feels" very slow.  We also noticed a physical performance drop in the system as a whole when disabling the EHCI interface on an Intel E660/620 and enabling the OHCI to run comparison tests against the Geode to see if the Intel would also lock-up.
> 
> Unfortunately I do not have a reliable way to reproduce the bug outside of running test-cases on multiple systems waiting for the ports to hang.  Any advice/pointers/insight would be appreciated.

You should build a test kernel with CONFIG_USB_DEBUG enabled.  When the 
problem occurs, post the output from the "dmesg" command -- not the 
contents of the system log file.

>     Feb  1 18:15:40 A303152068 kernel: [ 8404.370974] INFO: task java:3677 blocked for more than 120 seconds.
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.378724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.381100] java            D c103a90b     0  3677      1 0x00000004
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.398733]  dd72fe54 00000096 dd72fe20 c103a90b c1444530 dd512d80 de2690c0 dd6e0100
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.435612]  00000046 dd512d80 c103a90b dd72fe2c 00000046 c1444520 00000001 c1444520
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.476415]  00000246 dd72fe2c c1052801 dd72fe40 00000246 dd72fe70 dd72fe70 c1444520
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.513915] Call Trace:
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.515689]  [<c103a90b>] ? prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.521019]  [<c103a90b>] ? prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.526174]  [<c1052801>] ? trace_hardirqs_on+0xb/0xd
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.531238]  [<c103a90b>] ? prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.536154]  [<c129677d>] schedule+0x3b/0x3d
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.540189]  [<c11c5421>] usb_kill_urb+0x86/0x9f
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.544126]  [<c103a741>] ? wake_up_bit+0x1b/0x1b
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.548990]  [<c11d4250>] generic_cleanup+0x98/0xa4
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.552549]  [<c11d42d5>] usb_serial_generic_close+0x35/0x3a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.556484]  [<df11dc00>] ftdi_close+0x3b/0x54 [ftdi_sio]
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.561561]  [<c11d2133>] serial_down+0x23/0x25
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.564128]  [<c1177e76>] tty_port_shutdown+0x3c/0x4a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.568926]  [<c11785d2>] tty_port_close+0x19/0x37
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.574519]  [<c11d297e>] serial_close+0x44/0x4c
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.579157]  [<c117265e>] tty_release+0x98/0x2e9
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.586850]  [<c109b7ab>] __fput+0xdc/0x17a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.589187]  [<c109b954>] fput+0x17/0x19
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.592384]  [<c1098d7f>] filp_close+0x57/0x61
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.598120]  [<c1099e69>] sys_close+0x6d/0xa8
>     Feb  1 18:15:41 A303152068 kernel: [ 8404.603033]  [<c12976e5>] syscall_call+0x7/0xb
>     Feb  1 18:15:41 A303152068 kernel: [ 8404.608536]  [<c1290000>] ?  quirk_triton+0x7/0x25

This is useless.  We need to see what happens when the problem occurs, 
not 2 minutes later.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux