Re: Low Latency vs. Real Time Kernel - actual latencies ?

Fernando Lopez-Lezcano <nando@xxxxxxxxxxxxxxxxxx> · Thu, 01 Jan 2015 11:01:05 -0800

On 01/01/2015 09:16 AM, Simon Lewis wrote:
Hello Fernando

I guess the following has been asked before; however, I would just like
to be sure:

Starting with a naked disk drive and installing Fedora 21 from the JAM
spin, and then installing kernel-rt from the planetcore repo,

You should probably "yum install planetccrma-core" as that will make 
sure you have other stuff like rtirq.

would the
rt-kernel be automatically started with the threadirqs parameter? How to
check this?

Yes, threadirqs is automatically enabled for the rt patched kernels. 
Make sure that the rtirq service is starting on boot (it reorders the 
irqs to put the sound cards on top) - I think that is done automatically 
when you install rtirq.

I think running "ps axuw|grep irq" will show you the kernel irq threads. 
Also, "/usr/bin/rtirq status" should show you the current priority of 
irq threads.
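
For anyone who wants to script that check, here is a minimal sketch in Python. It assumes the procps "ps" output specifiers pid, cls, rtprio and comm, and that threaded IRQ handlers show up with names starting with "irq/". The function names and the sample values are made up for illustration.

```python
import subprocess

def parse_irq_threads(ps_output):
    """Return (pid, sched_class, rtprio, name) for each threaded-IRQ kernel thread."""
    threads = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        # The header row and ordinary tasks are skipped by the "irq/" filter.
        if len(fields) == 4 and fields[3].startswith("irq/"):
            pid, cls, rtprio, name = fields
            prio = int(rtprio) if rtprio.isdigit() else None
            threads.append((int(pid), cls, prio, name))
    # Highest real-time priority first.
    return sorted(threads, key=lambda t: -(t[2] or 0))

def list_irq_threads():
    """Run ps and parse its output (needs procps on the system)."""
    out = subprocess.run(["ps", "-eo", "pid,cls,rtprio,comm"],
                         capture_output=True, text=True, check=True).stdout
    return parse_irq_threads(out)
```

If the sound card's IRQ thread is not at or near the top of list_irq_threads(), rtirq has probably not run.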

-- Fernando

On 31.12.2014 at 22:23, Fernando Lopez-Lezcano wrote:

Sigh:

"This is exactly what the real-time patch is doing: it provides a
mechanism for aggregating the audio tasks, and for attributing them a
higher priority than the other tasks."

No, definitely not, this is NOT what the real-time patch does.

ANY KERNEL CAN DO THIS, you do not need to patch it at all.

As usual - I see this frequently - the writer confuses two different
mechanisms (or layers?) that contribute to audio apps having good
performance for low latency settings:

1) giving user tasks access to SCHED_FIFO and/or SCHED_RR scheduling.
What does this mean? The audio threads in your audio applications will
be able to run in these real-time scheduling classes and will preempt
any ordinary process in the computer (that is, the audio threads have
priority over everything else). This can be done with
/etc/security/limits.d/* (the current solution) or cgroups (newer, only
available in newer kernels). Both limits.conf and cgroups can do the
same thing - cgroups can also reserve some CPU for non-audio tasks
(could be a bad thing, could be a good thing, it depends on your goals).
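
A rough way to see whether point 1 is in place for your login session, sketched in Python. It assumes a Linux system and uses the standard os and resource modules. The helper names are hypothetical, and root or CAP_SYS_NICE would bypass the limit that fifo_allowed() models.

```python
import os
import resource

def rtprio_limit():
    """Soft and hard RLIMIT_RTPRIO of this process (set via limits.d or similar)."""
    return resource.getrlimit(resource.RLIMIT_RTPRIO)

def fifo_allowed(priority, soft_limit):
    """True if an unprivileged process with this soft limit may request `priority`."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return 1 <= priority <= soft_limit

def try_fifo(priority):
    """Briefly switch this process to SCHED_FIFO, then restore the old policy."""
    old_policy = os.sched_getscheduler(0)
    old_param = os.sched_getparam(0)
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False  # no real-time access: audio apps will not perform well
    os.sched_setscheduler(0, old_policy, old_param)
    return True
```

Calling try_fifo(70) from a normal user shell should return True once the limits are set up. The value 70 is only an example priority.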

If this is not done you will not get good performance out of audio apps,
period. And an RT patched kernel will not help at all.

2) running a kernel that has good low latency performance. There is a
whole range of options for this. The simplest is to enable full
preemption in a vanilla kernel. What does this do? It tries to minimize
the time the kernel spends in critical sections of code within which
scheduling is forbidden. If you can't schedule an audio task for a
"long" time you will get a click as the sound card is starved of
samples. These options can have a small but probably measurable impact
on overall performance (i.e., nothing is free).
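
To put a number on "long": the time budget is set by the audio buffer. A small worked sketch in Python, where the frame counts, the two-period layout and the 48000 Hz rate are example values and not taken from any particular setup:

```python
def period_ms(frames_per_period, sample_rate):
    """Duration of one period of audio, in milliseconds."""
    return 1000.0 * frames_per_period / sample_rate

def buffer_ms(frames_per_period, periods, sample_rate):
    """Duration of the whole buffer (all periods), in milliseconds."""
    return periods * period_ms(frames_per_period, sample_rate)

# Example settings only: 2 periods at 48 kHz with different period sizes.
for frames in (64, 128, 256):
    print(f"{frames:4d} frames x 2 @ 48000 Hz: "
          f"period {period_ms(frames, 48000):.2f} ms, "
          f"buffer {buffer_ms(frames, 2, 48000):.2f} ms")
```

Roughly speaking, with two periods the audio thread has about one period to be scheduled and refill the buffer, so a kernel critical section that lasts about as long as one period is already enough to cause a click.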

A step further is to boot the kernel with the threadirqs parameter _and_
properly optimize the priorities of the IRQ kernel threads (the rtirq
package does that). What does this do? It makes sure that the interrupt
request of the sound card is processed with higher priority than
(almost) all the others. The processing of the IRQ will trigger the
scheduling of the userland task that handles the audio samples, so it is
important to do this as well (and the priorities of the IRQ handling
routines and the userland audio threads - jack, for example - have to be
properly ordered).
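
As a sketch of what "properly ordered" can look like, here is a tiny Python check. The ordering it encodes (sound card IRQ thread above the userland audio thread, which in turn sits above the remaining IRQ threads) is one common arrangement and an assumption on my part, and the priority numbers in the usage note are illustrative only.

```python
def priorities_ordered(soundcard_irq, audio_thread, other_irqs):
    """True if soundcard IRQ > userland audio thread > every other IRQ thread.

    All arguments are real-time priorities (higher number = higher priority);
    other_irqs is a list and may be empty.
    """
    return soundcard_irq > audio_thread > max(other_irqs, default=0)
```

For example, priorities_ordered(90, 70, [50, 50]) is true, while a sound card IRQ left at 50 with jack at 70 fails the check.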

Going further you tinker with the kernel itself by patching it so that
more of it can be preempted (the type of kernel I maintain for Planet
CCRMA). This is the RT patch which is maintained separately from the
vanilla kernel. It significantly lowers the time the kernel spends in
critical sections of code that can stop scheduling of tasks. The smaller
that time, the faster an audio task will be scheduled after the sound
card signals the system it has (or needs) samples.

As fewer people actively use the RT patch, it tends to have more bugs.
Also, the RT patch has in the past uncovered bugs in the mainline code
that only showed up with the RT patch.

Over the past few years code from the RT patch has slowly migrated to the
vanilla kernel, so that the maximum latency of a properly configured
vanilla kernel has gone down significantly.

This is all further complicated by hyperthreading (fake cpu cores) and
the newest intel_pstate power budget cpu core speed control driver. You
need to optimize those things as well. For best performance (or even
decent performance) you will have to enable full speed on all cpu cores,
and very likely disable hyperthreading as well. I used to not notice a
difference with hyperthreading, but on my latest hardware I really need
to disable it.
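
A quick way to inspect both settings from Python, as a sketch. It assumes the sysfs layout under /sys/devices/system/cpu (the smt/active file only exists on kernels that provide it) and treats the "performance" governor as "full speed", which is a simplification. The function names are hypothetical.

```python
import glob
import os

def read_governors(root="/sys/devices/system/cpu"):
    """Map each cpuN directory to its current cpufreq scaling governor."""
    governors = {}
    pattern = os.path.join(root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in glob.glob(pattern):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors

def throttled_cpus(governors):
    """CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, gov in governors.items() if gov != "performance")

def smt_active(root="/sys/devices/system/cpu"):
    """True/False from the kernel's SMT switch, or None if the file is absent."""
    try:
        with open(os.path.join(root, "smt", "active")) as f:
            return f.read().strip() == "1"
    except OSError:
        return None
```

An empty result from throttled_cpus(read_governors()) and a False from smt_active() would match the tuning described above.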

And on some hardware (usually laptops) you are dead in the water: the
BIOS can have badly designed SMI (System Management Interrupt) handlers
that tie up the CPU for milliseconds and defeat everything that Linux
tries to do. Nothing can be done save for complaining to the vendor and
upgrading the BIOS.

Anyway, hopefully this was a clear explanation...

I'm smart enough to get what they're doing, but not smart enough to know
if this is what we're already doing in Fedora or how it'll affect other
security concerns.

----
$ grep PREEMPT /boot/config-3.17.4-200.fc20.x86_64
# CONFIG_PREEMPT_RCU is not set
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
----

So we do not have CONFIG_PREEMPT set. The Fedora kernel is not optimized
for low latency operation. See:

https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions

for some of the options available (PREEMPT_VOLUNTARY is the lowest
possible optimization).
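
The same check can be scripted. A minimal Python sketch, assuming the usual /boot/config-* text format shown above. The RT option names (CONFIG_PREEMPT_RT_FULL for the out-of-tree patch, CONFIG_PREEMPT_RT for newer kernels) are listed from memory, so verify them against your own config file.

```python
def preempt_model(config_text):
    """Return the strongest preemption option enabled in a kernel config, or None."""
    enabled = {line.split("=", 1)[0]
               for line in config_text.splitlines()
               if line.startswith("CONFIG_") and line.rstrip().endswith("=y")}
    # Ordered from most to least preemptible; RT kernels also set CONFIG_PREEMPT.
    for option in ("CONFIG_PREEMPT_RT_FULL", "CONFIG_PREEMPT_RT",
                   "CONFIG_PREEMPT", "CONFIG_PREEMPT_VOLUNTARY",
                   "CONFIG_PREEMPT_NONE"):
        if option in enabled:
            return option
    return None

# The Fedora config quoted above.
fedora_config = """\
# CONFIG_PREEMPT_RCU is not set
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
"""
print(preempt_model(fedora_config))
```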

Also the settings listed
for /etc/security/limits.conf are setting you up for a bad time.

They said they could get less than 1 ms with no xruns (except at
application startup), which sounds promising, certainly if we're shooting
for less than 5 ms instead of less than 1 ms.

The statement in that page regarding performance is, well, meaningless.
It does not state what hardware is used. It also says that it gets xruns
"only at application startup" (which application? under which conditions?).

If you are running a properly tuned system and the audio applications
are properly coded - a big if - then you should never[*] get xruns. If
you get them sometimes it means that, well, your system is not useful
for low latency work. The question would be: do you get the same
performance if you _load_ your system? Can you play at 16x2 without
xruns while reading email, browsing the web and copying a file tree with
rsync? Even when all CPU cores are cranking up at 60-70% utilization? If
the answer is yes then you are in business...

-- Fernando

[*] "never" does not really mean never: if the load on the computer
is really, really high, then at some point you will run out of CPU and
you will get an xrun. At that point you need to get a faster computer :-)

_______________________________________________
music mailing list
music@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/music
