Em Wed, 08 Jan 2014 15:05:12 -0200 Mauro Carvalho Chehab <m.chehab@xxxxxxxxxxx> escreveu: > Em Wed, 08 Jan 2014 14:31:28 -0200 > Mauro Carvalho Chehab <m.chehab@xxxxxxxxxxx> escreveu: > > > Em Thu, 02 Jan 2014 14:07:22 -0800 > > Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> escreveu: > > > > > On Sun, Dec 29, 2013 at 02:54:40AM -0200, Mauro Carvalho Chehab wrote: > > > > It seems that usb_unlink_urb() is causing troubles with xHCI: the > > > > endpoint stops streaming, but, after that, it doesn't start again, > > > > and lots of debug messages are produced. I emailed you the full log > > > > after start streaming in priv (too big for vger), but basically, > > > > it produces: > > > > > > > > [ 1635.754546] xhci_hcd 0000:00:14.0: Endpoint 0x81 not halted, refusing to reset. > > > > [ 1635.754562] xhci_hcd 0000:00:14.0: Endpoint 0x82 not halted, refusing to reset. > > > > [ 1635.754577] xhci_hcd 0000:00:14.0: Endpoint 0x83 not halted, refusing to reset. > > > > [ 1635.754594] xhci_hcd 0000:00:14.0: Endpoint 0x84 not halted, refusing to reset. > > > > > > I think that's due to the driver (or userspace) attempting to reset the > > > endpoint when it didn't actually receive a stall (-EPIPE) status from an > > > URB. When that happens, the xHCI host controller endpoint "toggle" bits > > > get out of sync with the device toggle bits, and the result is that all > > > transfers will fail to the endpoint from then on until you switch > > > alternate interface settings or unplug/replug the device. > > > > > > Try this patch: > > > > > > http://marc.info/?l=linux-usb&m=138116117104619&w=2 > > > > > > It's still under RFC, and I know it has race conditions, but it will let > > > you quickly test whether this fixes your issue. > > > > Didn't work fine, or at least it didn't solve all the problems. Also, it > > started to cause OOPSes due to the race conditions. > > > > > > > > This has been a long-standing xHCI driver bug. I asked my OPW intern to > > > work on the patch to fix it, but she may be a bit busy with her new job > > > to finish up the RFC. I'll probably have to take over finishing the > > > patch, if this turns out to be your issue. > > > > > > > (Not sure why it is trying to stop all endpoints - as just one endpoint was > > > > requested to restart). > > > > > > Something is calling into usb_clear_halt() with all the endpoints. > > > Userspace, perhaps? > > > > No, userspace is not doing it. The userspace doesn't even know that this > > device is USB (and were written at the time that all media drivers were > > PCI only - so it doesn't have any USB specific call on it). > > > > > You could add WARN() calls to usb_clear_halt() to > > > see what code is resetting the endpoints. In any case, it's not part of > > > the USB core code to change configuration or alt settings, since I don't > > > see any xHCI driver output from the endpoint bandwidth code in this > > > chunk of the dmesg you sent: > > > > The em28xx-audio.c driver may need to call usb_set_interface() while > > the video is still streaming, in order to unmute the audio. That happens > > when the audio device is opened. > > > > With EHCI, this works properly. > > > > > [ 1649.640783] xhci_hcd 0000:00:14.0: Removing canceled TD starting at 0xb41e8580 (dma). > > > [ 1649.640784] xhci_hcd 0000:00:14.0: TRB to noop at offset 0xb41e8580 > > > [ 1649.643159] xhci_hcd 0000:00:14.0: Endpoint 0x81 not halted, refusing to reset. > > > [ 1649.643188] xhci_hcd 0000:00:14.0: Endpoint 0x82 not halted, refusing to reset. > > > [ 1649.643215] xhci_hcd 0000:00:14.0: Endpoint 0x83 not halted, refusing to reset. > > > [ 1649.643239] xhci_hcd 0000:00:14.0: Endpoint 0x84 not halted, refusing to reset. > > > [ 1649.735539] xhci_hcd 0000:00:14.0: ERROR no room on ep ring, try ring expansion > > > > > > Sarah Sharp > > > > Btw, sometimes, I get such logs: > > > > [ 646.192273] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > [ 646.192292] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > [ 646.192311] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > [ 646.192329] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > [ 646.192351] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > [ 646.192376] xhci_hcd 0000:00:14.0: Miss service interval error, set skip flag > > > > After adding some debug at em28xx-audio, triggering alsa trigger start > > events, I'm getting those: > > > > [ 3078.971224] snd_em28xx_capture_trigger: start capture > > [ 3078.971284] xhci_hcd 0000:00:14.0: ERROR no room on ep ring, try ring expansion > > [ 3078.971311] xhci_hcd 0000:00:14.0: ring expansion succeed, now has 4 segments > > [ 3078.971350] xhci_hcd 0000:00:14.0: ERROR no room on ep ring, try ring expansion > > [ 3078.971387] xhci_hcd 0000:00:14.0: ring expansion succeed, now has 8 segments > > [ 3079.034626] em28xx_audio_isocirq, 64 packets (first one with size 12) > > > > Here, some audio data arrives. > > > > [ 3079.034665] snd_em28xx_capture_trigger: stop capture > > > > It seems, however, that this didn't arrive in time, and causes an alsa > > buffer underrun. So, it cancels the existing URBs. > > > > PS.: Even with EHCI, it causes a few ALSA underruns before it gets steady. > > I suspect that this is due to em28xx time to synchronize audio and video > > streams. > > > > [ 3079.034736] xhci_hcd 0000:00:14.0: Cancel URB ffff880207900000, dev 4, ep 0x83, starting at offset 0x1ffb13850 > > [ 3079.034755] xhci_hcd 0000:00:14.0: // Ding dong! > > [ 3079.034783] xhci_hcd 0000:00:14.0: Stopped on Transfer TRB > > [ 3079.034790] snd_em28xx_capture_trigger: start capture > > > > While xHCI is still canceling the URBs, a new trigger happens, and it > > calls usb_submit_urb(). > > > > [ 3079.034819] xhci_hcd 0000:00:14.0: Removing canceled TD starting at 0x1ffb13850 (dma). > > [ 3079.034835] xhci_hcd 0000:00:14.0: TRB to noop at offset 0x1ffb13850 > > ... > > [ 3079.036341] xhci_hcd 0000:00:14.0: Removing canceled TD starting at 0xb624b850 (dma). > > [ 3079.036352] xhci_hcd 0000:00:14.0: TRB to noop at offset 0xb624b850 > > [ 3079.036365] em28xx_audio_isocirq, 64 packets (first one with size 0) > > > > But xHCI only finishes cancelling the first URB here... > > > > [ 3079.036382] xhci_hcd 0000:00:14.0: Cancel URB ffff880207900800, dev 4, ep 0x83, starting at offset 0x1ff937010 > > ... > > [ 3079.043158] xhci_hcd 0000:00:14.0: TRB to noop at offset 0x1ffb13840 > > [ 3079.043170] em28xx_audio_isocirq, 64 packets (first one with size 0) > > > > And only here, it finishes to cancel the entire operation. > > > > [ 3079.043231] xhci_hcd 0000:00:14.0: ERROR no room on ep ring, try ring expansion > > [ 3079.043299] xhci_hcd 0000:00:14.0: ring expansion succeed, now has 16 segments > > [ 3079.428996] em28xx_audio_isocirq, 64 packets (first one with size 64) > > > > Finally, after ~400ms after the new usb_submit_urb(), the first audio packet > > appears... > > > > [ 3079.429069] snd_em28xx_capture_trigger: stop capture > > > > However, this is not fast enough to avoid ALSA buffer underrun. So, > > the driver cancels the existing URBs... > > > > [ 3079.429204] xhci_hcd 0000:00:14.0: Cancel URB ffff880207900000, dev 4, ep 0x83, starting at offset 0xc5a7f4b0 > > [ 3079.429241] snd_em28xx_capture_trigger: start capture > > > > And submits a new set. > > > > Not sure how to fix it. > > Hmm... calling xawtv with a very high-latency alsa buffer works (the > default latency is 30ms, with works fine with an EHCI port): > > $ xawtv --alsa-latency 500 > > Of course, a half-second latency means that audio and video won't be > properly synchronized, but at least audio works. > > I'll turn off the USB logs and do more experiences with the latency, > in order to have an idea on how faster is EHCI to handle the > ISOC requests, when compared with xHCI. Ok, at lest on my quad-core 3rd gen i7core notebook, a latency of 65ms is enough for audio to work on em28xx with xHCI. However, using such latency on EHCI causes underruns. A latency of 90ms seems to work fine on both drivers. I'm starting to wander that maybe xHCI is not using the same urb->interval than EHCI, and if this could explain those issues. Regards, -- Cheers, Mauro -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html