On Sun, Jan 27, 2013 at 08:07:01PM -0500, Alan Stern wrote: > This is a place where 3.8 has changed from 3.6. Does your problem go > away if you revert the first hunk of your patch and simply set > > ehci->next_frame = (now_frame - 1) & fmask; > > here? Yes, this by itself does resolve the problem. I also tried applying the changes since 3.6.11 that touch ehci-sched.c, and then changing the logic for last_iso_frame to do as above and start the next scan one frame earlier, and that's also working. Thanks. On Sun, Jan 27, 2013 at 08:07:01PM -0500, Alan Stern wrote: > On Sun, 27 Jan 2013, Andy Leiserson wrote: > > > Please CC me on replies, as I'm not on the list. > > > > My system is an AR9132 mips SoC, kernel 3.6.11 with OpenWrt patches. > > The audio device is a C-Media usb audio adapter, IDs 0d8c/000e. It is a > > full speed device, and is connected directly to the only USB port. I'm > > using only the output, but it does have a mic input as well. (There are > > also various controls, e.g. mute, suggested by the USB descriptors, but > > those aren't present on my device.) > > > > The device was working fine with OpenWrt-patched kernel 3.3.8. > > > > With the new kernel, I have the following problems: > > * audio output is choppy (underruns). the usb audio driver > > intermittently prints: > > > > ALSA sound/usb/pcm.c:1187 delay: estimated 624, actual 288 > > > > (I realize the message is not directly indicative of an underrun, but > > I believe the underruns and the message are correlated) > > > > This tends to get worse as playback progresses, and sometimes there > > are "cannot submit urb" messages with error -27, and sometimes > > playback hangs. > > > > * audio programs almost always hang in "D" state when closing the audio > > device. > > > > I investigated the second problem first in hopes I could eliminate the > > reboot after each experiment, but in the end, both problems went away > > simultaneously. > > You should be aware that there have been changes to the periodic > scanning code in ehci-hcd since the kernel you've been using. It might > matter (although I suspect it doesn't). > > > I tracked the hang to the sleep in ehci_endpoint_disable that waits for > > active iso transactions to finish (indicated by stream->td_list being > > empty). > > > > My theory is that the changes in > > f42890782241a60d107f23d08089a4a12b507a11 ("USB: EHCI: simplify > > isochronous scanning") result in leakage of sitds when still-active > > sitds in the previous frame are skipped by scan_isoc. I don't know why > > this results in the underruns, but it is a low-memory embedded system, > > so possibly it's just leaking enough descriptors to run out of > > dma-eligible memory. > > It looks like you are right. Those sitds are supposed to get > deallocated the next time scan_isoc runs, but they probably aren't. > > > The patch below restores the behavior before commit f428 of stopping the > > scan immediately once a still-active descriptor is found (and leaving > > next_frame at the previous frame rather than the current frame if > > appropriate). This patch resolves the problems I've been seeing, but I > > don't know if it's the best fix. > > Stopping the scan is the wrong thing to do, but I think you may be > right about the next_frame value. > > > I have various ftrace captures, which I can post somewhere if there's > > interest. Some of them have the iso stream refcounting restored, and > > clearly show that there aren't as many puts as gets. The detail of > > scan_isoc behavior isn't visible in the traces, though. > > The gets and puts shouldn't matter. As long as the sitd structures get > deallocated when they aren't active any more, everything should work > properly. > > > (An aside: I noticed there is an rmb() before the itd status check, but > > not before the sitd status check.) > > That's because the itd code loops over the members of > q.itd->hw_transaction whereas the sitd code does nothing of the sort. > > > --- a/drivers/usb/host/ehci-sched.c 2013-01-26 19:37:21.000000000 -0800 > > +++ b/drivers/usb/host/ehci-sched.c 2013-01-26 19:39:25.000000000 -0800 > > @@ -2296,7 +2296,7 @@ > > type = Q_NEXT_TYPE(ehci, > > q.sitd->hw_next); > > q = *q_p; > > - break; > > + goto end_scan; > > No, we definitely shouldn't be doing this. > > > } > > > > /* Take finished SITDs out of the schedule > > @@ -2336,5 +2336,6 @@ > > break; > > frame = (frame + 1) & fmask; > > } > > - ehci->next_frame = now_frame; > > +end_scan: > > + ehci->next_frame = frame; > > } > > This is a place where 3.8 has changed from 3.6. Does your problem go > away if you revert the first hunk of your patch and simply set > > ehci->next_frame = (now_frame - 1) & fmask; > > here? > > Alan Stern > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html