On Tue, 22 Nov 2022 15:19:13 +0100, Takashi Iwai wrote: > > On Tue, 22 Nov 2022 12:16:47 +0100, > Carl Hetherington wrote: > > > > Hi Takashi, > > > > Thank you for getting back to me! > > > > On Tue, 22 Nov 2022, Takashi Iwai wrote: > > > > [snip] > > > > > > Now, snd_usb_endpoint_start() is called on 0x2 and that is fine. Next, > > > > snd_usb_endpoint_start() is called on 0x82 and that fails because its > > > > state is still STOPPING. > > > > > > > > At this point things seem broken. > > > > > > > > Does anyone have a hint about where in this sequence things are going > > > > wrong, and maybe even why? > > > > > > The problem is that it's treating XRUNs on the both streams > > > individually. It's correct to recover only the PCM stream when an > > > XRUN is reported to the PCM stream. However, for an XRUN on the > > > capture stream that serves as a sync source, it should stop and > > > recover not only the capture PCM stream but also the playback stream > > > as a sync sink as well. > > > > > > Below is a possible test fix (totally untested!). > > > This may give XRUNs twice eventually, which is a bit confusing, but it > > > aligns with the actual hardware behavior, at least. > > > > [snip fix] > > > > Makes sense, thank you! Sadly, the fix doesn't seem to work because (I > > think) the xruns I'm seeing come via a different path (not though > > notify_xrun()). Mine arrive via this trace: > > > > __snd_pcm_xrun > > snd_pcm_update_state > > snd_pcm_update_hw_ptr > > usb_hcd_giveback_urb > > snd_pcm_period_elapsed_under_stream_lock > > snd_pcm_period_elapsed > > retire_capture_urb > > snd_complete_urb > > > > I'll see if can apply a similar fix to this case, though to my naive > > eyes it looks a little trickier as the xrun is found in the snd_pcm > > code rather than the USB code. Any suggestions most welcome! > > OK, then it's a bit different problem, and not so trivial to fix in > the kernel side alone, I'm afraid. Basically it's a race between > start and stop of two streams. The key point is that, for stopping a > (USB) stream, a sync-stop operation is needed, and this can't be > performed at the PCM trigger itself (which is an atomic operation). > So, the kernel trigger may at most return an error there. > > I assume that it's from snd_usb_endpoint_start() and it returning > -EPIPE error. If so, we may change the PCM core code to set the PCM > state again XRUN in such an error case, so that application may repeat > the standard recovery process. Something like below. Also, it might be slightly better if we swap the starting order of two streams: sync at first, then data. A race can still happen, though. Takashi -- 8< -- --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -241,19 +241,19 @@ static int start_endpoints(struct snd_usb_substream *subs) if (!subs->data_endpoint) return -EINVAL; - if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) { - err = snd_usb_endpoint_start(subs->data_endpoint); + if (subs->sync_endpoint && + !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) { + err = snd_usb_endpoint_start(subs->sync_endpoint); if (err < 0) { - clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags); + clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags); goto error; } } - if (subs->sync_endpoint && - !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) { - err = snd_usb_endpoint_start(subs->sync_endpoint); + if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) { + err = snd_usb_endpoint_start(subs->data_endpoint); if (err < 0) { - clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags); + clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags); goto error; } }