At Wed, 13 May 2009 13:37:03 -0400,
Jon Smirl wrote:
> 
> On Wed, May 13, 2009 at 1:29 PM, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > On Wed, May 13, 2009 at 12:09 PM, Takashi Iwai <tiwai@xxxxxxx> wrote:
> >> At Wed, 13 May 2009 09:38:50 -0400,
> >> Jon Smirl wrote:
> >>>
> >>> On Wed, May 13, 2009 at 9:25 AM, Jaroslav Kysela <perex@xxxxxxxx> wrote:
> >>> > On Wed, 13 May 2009, Jon Smirl wrote:
> >>> >
> >>> >> There's a long thread over on the pulse list about glitch-free
> >>> >> playback. The glitches they are encountering are caused by CPU
> >>> >> scheduling latency. They are trying to fix this by setting HZ up
> >>> >> to 1000 and constantly polling the audio DMA queue to keep it
> >>> >> 99% full.
> >>> >>
> >>> >> This doesn't seem like the right solution to me. It is fixing the
> >>> >> symptom, not the cause. The cause is 200-300ms scheduling
> >>> >> latency. The source of that needs to be tracked down and fixed in
> >>> >> the kernel. But we have to live with the latencies until they are
> >>> >> fixed.
> >>> >>
> >>> >> The strategy of checking the queue at 1000Hz works, but it is
> >>> >> very inefficient. The underlying problem is that the buffer ALSA
> >>> >> is using is too small on systems with 300ms latency. The buffer
> >>> >> is just big enough to cover 300ms, so they rapidly check and fill
> >>> >> it at 1000Hz to ensure that it is full when the 300ms latency
> >>> >> strikes.
> >>> >
> >>> > ??? The ring buffer size is not limited if the hw allows that.
> >>> >
> >>> >> On my hardware with period interrupts ALSA is only checking the
> >>> >> buffer at 8Hz. Since I'm checking appl_ptr I know when DMA
> >>> >> overruns the buffers. This allows me to insert silence, and I
> >>> >> could indicate this
> >>> >
> >>> > Inserting silence might be wrong if it breaks the stream timing.
> >>> > The elapsed() callback should be called at the exact timing (and
> >>> > the position should be updated, too).
> >>> >
> >>> >> condition to ALSA if there were a mechanism for doing so.
> >>> >> ALSA could use this overrun knowledge to measure scheduling
> >>> >> latency and adjust the buffering.
> >>> >>
> >>> >> But the DMA interface between ALSA and the driver is fixed at
> >>> >> stream creation time. There's no way to dynamically alter it
> >>> >> (like window size changes in TCP/IP). With networking you get a
> >>> >> list of buffers to send. As you send these buffers you mark them
> >>> >> sent. The core is free to hand you buffers straight from user
> >>> >> space or to do copies and use internal ring buffers. The network
> >>> >> driver just gets a list of physical addresses to send. This
> >>> >> buffer bookkeeping could occur in snd_pcm_period_elapsed().
> >>> >>
> >>> >> A dynamic chaining mechanism allows you to alter the buffering
> >>> >> mid-stream. If the driver indicates a DMA overrun error, this
> >>> >> tells ALSA that it needs to insert another buffer. After a while
> >>> >> these errors will stop, and ALSA will have measured the
> >>> >> worst-case CPU scheduling latency. From then on it will know the
> >>> >> exact amount of buffering needed for the kernel it is running
> >>> >> on, and it can use this knowledge at stream creation time. Now
> >>> >> filling the buffer at 8Hz or lower will work, and you don't have
> >>> >> to spend the power associated with 1000Hz timer interrupts.
> >>> >
> >>> > Nothing prevents the application from allocating a big ring
> >>> > buffer and writing samples only as necessary. The application is
> >>> > the producer and controller in this case. The midlevel layer can
> >>> > hardly do anything if samples are not available. The situation
> >>> > will be more or less bad.
> >>>
> >>> Who is going to dynamically measure the scheduling latency of the
> >>> kernel and compute the correct buffer size for the low-level
> >>> driver? You can't expect every app to do that.
> >>>
> >>> > The whole problem is that the standard Linux kernel is not
> >>> > realtime, but audio is a realtime task.
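As an aside, the buffer-size arithmetic underlying this exchange (a buffer just large enough to cover the worst-case scheduling latency at a given sample rate) can be sketched in a few lines of C. This is a minimal illustration, not an ALSA API; the helper name `frames_for_latency` is hypothetical:

```c
#include <assert.h>

/* Hypothetical helper (not part of ALSA): compute the minimum ring-buffer
 * size, in frames, needed to ride out a given worst-case scheduling
 * latency, plus a safety margin in percent.  The latency is rounded up
 * to a whole frame. */
unsigned long frames_for_latency(unsigned int rate_hz,
                                 unsigned int latency_ms,
                                 unsigned int margin_pct)
{
    /* frames needed to cover latency_ms of playback, rounded up */
    unsigned long base = ((unsigned long)rate_hz * latency_ms + 999) / 1000;
    /* add percentage headroom on top of the bare minimum */
    return base + base * margin_pct / 100;
}
```

For example, a 300ms stall at 48kHz needs at least 14400 frames of buffer; with 25% headroom that becomes 18000 frames. This is the quantity the 1000Hz polling strategy tries to keep topped up.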
> >>>
> >>> By that definition networking is a real-time task too, but it's
> >>> not.
> >>>
> >>> Playing MP3s is not a real-time task. The buffering system between
> >>> the app and ALSA's DMA system is not properly communicating
> >>> feedback, and that's what is causing the problem. Networking has a
> >>> correct feedback loop and doesn't get into trouble. ALSA's
> >>> buffering system isn't flexible enough to hide these big scheduling
> >>> latencies without losing data.
> >>
> >> It's not about flexibility. The current audio system itself is
> >> flexible enough to solve the problem you mentioned. But you just
> >> need to do everything by yourself. A car with manual gears is as
> >> flexible as a car with automatic gears from the performance POV,
> >> but the driver needs more work to run it smoothly.
> >>
> >> Also, automation isn't always the best thing. For example, think
> >> about automatically resizing the buffer and restarting the stream:
> >> do you really want this for a system like JACK? No...
> >
> > The automation I proposed would only kick in on a kernel with poor
> > latencies. You could write a message into the log saying that buffer
> > sizes were increased due to latency problems.
> >
> > This would also be a clear message to anyone running JACK that their
> > kernel was not performing adequately. On a good, low-latency kernel
> > these mechanisms would never trigger.
> 
> BTW, logging messages saying that audio buffer sizes were increased
> due to poor kernel latency would be a good way of providing an
> incentive to fix these latency problems. This would make it clear to
> more people that this was the cause of their audio being garbled
> (instead of some other cause).
> 
> You don't want this logic in an app; it needs to be in the kernel.

As Mark suggested, there is a layer between them: alsa-lib (or any
upper-layer library) or a sound subsystem. Only OSS accesses the
kernel directly.

> You want the kernel to remember the longest observed latency.
> You don't want each app to have to rediscover this value.

Well, how would you identify that a buffer xrun is caused purely by
scheduler latency, and not by an application or operational fault?
For example, stop a dumb console player app via Ctrl-Z. Now you get a
"buffer underrun". Similarly, if you have a buggy app, a buffer xrun
can happen quite easily, while other apps run pretty well with a
smaller buffer.

Also, if you use JACK, you don't always want to raise the buffer size.
Instead, you may want to reduce the workload. So the condition *is*
dependent on the application (or sound subsystem). You can't simply
take one value for all.

Of course, I agree that in most cases apps don't care about the
buffering by themselves. A more reliable buffer handling in the lower
level is a good and long-wanted solution. If we need to provide it in
ALSA itself, as mentioned, I guess this can fit more easily into the
alsa-lib layer than into the kernel. The kernel basically doesn't know
better than user space about this kind of thing.


thanks,

Takashi
_______________________________________________
Alsa-devel mailing list
Alsa-devel@xxxxxxxxxxxxxxxx
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
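The adaptive scheme debated in the thread, in which the buffer grows each time the driver reports an overrun and the largest size ever needed is remembered so later streams can start from it, can be modeled in a few lines of C. This is only a sketch of the bookkeeping; the `latency_tracker` structure and its functions are hypothetical and do not exist in ALSA:

```c
#include <assert.h>

/* Hypothetical bookkeeping (not an ALSA interface) for the adaptive
 * buffering idea: on each reported DMA overrun, double the target buffer
 * size up to a cap, and remember the largest size this kernel has ever
 * needed. */
struct latency_tracker {
    unsigned long buffer_frames;  /* current target buffer size in frames */
    unsigned long worst_frames;   /* largest size this kernel has needed  */
};

void tracker_init(struct latency_tracker *t, unsigned long initial_frames)
{
    t->buffer_frames = initial_frames;
    t->worst_frames  = initial_frames;
}

/* Call when the driver signals an overrun (xrun): grow the buffer. */
void tracker_on_xrun(struct latency_tracker *t, unsigned long cap_frames)
{
    unsigned long next = t->buffer_frames * 2;
    if (next > cap_frames)
        next = cap_frames;
    t->buffer_frames = next;
    if (next > t->worst_frames)
        t->worst_frames = next;
}

/* Size a later stream could start from, per Jon's proposal. */
unsigned long tracker_suggested_frames(const struct latency_tracker *t)
{
    return t->worst_frames;
}
```

Starting from 1024 frames, two overruns against a 16384-frame cap leave the tracker suggesting 4096 frames for future streams. Takashi's objection above applies directly to `tracker_on_xrun`: it cannot tell a scheduler-induced xrun from one caused by a stopped or buggy application, so a single kernel-wide value would grow on false positives.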