Hi Takashi,

Thanks for taking a look at this.

> I'm not sure whether that's the case.  Do you mean that one thread
> gets stuck at pcm_release_private() which calls snd_pcm_unlink()?
> Or do you really use the PCM linkage?

We're not explicitly using the link/unlink APIs, so I think it must be
pcm_release_private().

I'll try out your suggestion over the next couple of days.  In the
meantime we've avoided the issue by arranging for the realtime threads
to have the same priority (which I think they should have anyway).

Rob.

At 08:14 on Tue, Oct 02 2018, Takashi wrote:
> On Fri, 28 Sep 2018 18:23:24 +0200,
> Rob Duncan wrote:
>>
>> I'm trying to address a bug where we end up with a thread spinning and
>> consuming an entire cpu.  The issue seems to be this code in
>> sound/core/pcm_native.c:
>>
>>     /* Writer in rwsem may block readers even during its waiting in queue,
>>      * and this may lead to a deadlock when the code path takes read sem
>>      * twice (e.g. one in snd_pcm_action_nonatomic() and another in
>>      * snd_pcm_stream_lock()).  As a (suboptimal) workaround, let writer to
>>      * spin until it gets the lock.
>>      */
>>     static inline void down_write_nonblock(struct rw_semaphore *lock)
>>     {
>>             while (!down_write_trylock(lock))
>>                     cond_resched();
>>     }
>>
>> The original commit for this is 67ec1072b053c15564e6090ab30127895dc77a89
>>
>> What we're suspecting is that a normal thread (SCHED_OTHER) has a reader
>> lock and a real-time thread using SCHED_RR or SCHED_FIFO is trying to
>> take the writer lock.  If both threads are pinned to the same CPU for
>> some reason then the reader thread will never get scheduled (because the
>> real-time writer thread is still runnable), and we will never make
>> progress.
>>
>> Does this sound right?  What can we do to fix this?
>
> I'm not sure whether that's the case.  Do you mean that one thread
> gets stuck at pcm_release_private() which calls snd_pcm_unlink()?
> Or do you really use the PCM linkage?
>
> In the former case, we may loosen it by optimizing like the patch
> below (totally untested).  I guess it won't be a problem about racy
> access, but need double-checks afterward.
>
>
> thanks,
>
> Takashi
>
>
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -2369,7 +2369,8 @@ int snd_pcm_hw_constraints_complete(struct snd_pcm_substream *substream)
>  
>  static void pcm_release_private(struct snd_pcm_substream *substream)
>  {
> -	snd_pcm_unlink(substream);
> +	if (snd_pcm_stream_linked(substream))
> +		snd_pcm_unlink(substream);
>  }
>  
>  void snd_pcm_release_substream(struct snd_pcm_substream *substream)
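
For anyone who wants to poke at the spin-until-acquired pattern outside the kernel, here is a rough userspace analogue. It is only a sketch under stated assumptions, not the kernel code: pthread_rwlock_trywrlock() and sched_yield() stand in for down_write_trylock() and cond_resched(), and the names write_lock_spin(), reader_thread() and spin_writer_demo() are made up for illustration. Both threads run under the default scheduling policy here, so the reader does get to run and the writer eventually wins.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Userspace stand-in for the kernel's struct rw_semaphore (assumption). */
static pthread_rwlock_t demo_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Set by the reader once it actually holds the read lock. */
static atomic_int reader_holds_lock;

/*
 * Hypothetical analogue of down_write_nonblock(): poll for the write
 * lock and yield the CPU between attempts.  Returns the number of
 * failed attempts so the caller can see how long it spun.
 */
static unsigned long write_lock_spin(pthread_rwlock_t *lock)
{
	unsigned long spins = 0;

	while (pthread_rwlock_trywrlock(lock) != 0) {
		spins++;
		sched_yield();
	}
	return spins;
}

/* Reader: take the read lock, hold it for about 50 ms, then release it. */
static void *reader_thread(void *arg)
{
	struct timespec hold = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	(void)arg;
	pthread_rwlock_rdlock(&demo_lock);
	atomic_store(&reader_holds_lock, 1);
	nanosleep(&hold, NULL);
	pthread_rwlock_unlock(&demo_lock);
	return NULL;
}

/*
 * Start the reader, wait until it holds the lock, then spin for the
 * write lock.  Returns the writer's failed-attempt count, or -1 if the
 * reader thread could not be created.
 */
long spin_writer_demo(void)
{
	pthread_t reader;
	long spins;

	atomic_store(&reader_holds_lock, 0);
	if (pthread_create(&reader, NULL, reader_thread, NULL) != 0)
		return -1;

	while (!atomic_load(&reader_holds_lock))
		sched_yield();

	spins = (long)write_lock_spin(&demo_lock);
	pthread_rwlock_unlock(&demo_lock);
	pthread_join(reader, NULL);
	return spins;
}
```

Build with -pthread and call spin_writer_demo() from a main() of your own.  The sketch deliberately leaves out the real-time part: if the writer were SCHED_FIFO at a higher priority and pinned to the same CPU as the reader, sched_yield() would only hand the CPU to threads of equal or higher priority, so the reader would not run and the loop would not terminate, which matches the scenario suspected above.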