SCHED_OTHER > SCHED_FIFO?

Sebastian Kuzminsky <seb@xxxxxxxxxxx> · Wed, 12 Oct 2022 15:14:51 -0600

Hi folks, I'm seeing behavior I don't understand, and I'm hoping you 
can help me with it.

The setup is:
    Raspberry Pi 4 (4x ARMv8 Cortex-A72)
    Debian Bookworm arm64 (https://raspi.debian.net/tested-images/)
    Linux 5.19.14 with 5.19-rt10 patches (config in git repo below)
    cmdline:
	irqaffinity=0-1
	rcu_nocbs=2-3
	rcu_nocb_poll
	nohz_full=2-3
	isolcpus=nohz,domain,managed_irq,2-3

On this system I run a realtime thread that does all the 
latency-reducing tricks I know of:
    kernel command line as specified above
    cpufreq scaling governor = performance
    mlockall()
    set cpu affinity to an isolcpu processor
    run in userspace continuously, never sleep or yield the cpu
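For reference, the in-process part of that list (mlockall() and the cpu 
affinity) can be sketched roughly as below.  This is a minimal 
illustration, not a copy of the test program: the helper names 
lock_memory() and pin_to_cpu() are made up for this example, and the 
kernel command line and cpufreq governor are assumed to be configured 
outside the program.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all current and future pages into RAM so
 * the realtime thread does not take page faults later.
 * Returns 0 on success, -1 on failure (e.g. RLIMIT_MEMLOCK too low). */
int lock_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: restrict the calling thread to a single CPU,
 * e.g. one of the isolated CPUs (2 or 3 in the setup above).
 * Returns 0 on success, -1 on failure. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    int r;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* pthread_setaffinity_np() returns an errno value, not -1. */
    r = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (r != 0) {
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(r));
        return -1;
    }
    return 0;
}
```

A thread would typically call pin_to_cpu(3) at the top of its start 
routine, after the main thread has called lock_memory() once for the 
whole process.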

The surprising behavior is that if I set the realtime thread to use 
scheduling policy SCHED_FIFO, and set its priority to the max, I get 
terrible behavior - lots of preemption, and lots of idle time on the 
CPU.  On the other hand, if I leave the realtime thread with scheduling 
policy SCHED_OTHER, and the default priority of 0, then the system 
performs great - hardly any preemption.
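To be concrete about what "SCHED_FIFO at max priority" means here, the 
thread attributes can be prepared roughly as in the sketch below.  The 
function name init_fifo_max_attr() is hypothetical and used only for 
illustration; the actual test program may set things up differently.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: initialize attr so that a thread created with it
 * runs SCHED_FIFO at the highest priority the policy allows.
 * Returns 0 on success, or an errno value on failure. */
int init_fifo_max_attr(pthread_attr_t *attr)
{
    struct sched_param param;
    int r;

    param.sched_priority = sched_get_priority_max(SCHED_FIFO);

    r = pthread_attr_init(attr);
    if (r != 0)
        return r;

    /* With the default PTHREAD_INHERIT_SCHED, the policy and priority
     * stored in attr are ignored and the new thread inherits the
     * creating thread's scheduling (SCHED_OTHER in the first test). */
    r = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (r == 0)
        r = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (r == 0)
        r = pthread_attr_setschedparam(attr, &param);

    if (r != 0)
        pthread_attr_destroy(attr);
    return r;
}
```

The resulting attr is then passed to pthread_create().  Preparing the 
attributes needs no special privileges, but creating a SCHED_FIFO 
thread with them normally requires root, CAP_SYS_NICE, or a suitable 
RLIMIT_RTPRIO.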

The test program is here:
    https://github.com/SebKuzminsky/preempt-rt-latency-test

That program does the process-wide realtime setup, then runs a realtime 
thread with SCHED_OTHER (which performs well), joins that thread, and 
then runs a second thread with SCHED_FIFO (which performs poorly). 
The order doesn't matter: if I run the SCHED_FIFO thread first, it 
still performs poorly.  Both threads run the same function:
    busywait 1 ms (using the hardware cycle counter for timing)
    check the cycle count
    repeat 10k times
    return

So it doesn't do anything useful; it just looks for latency.

The results look like this:

using PTHREAD_INHERIT_SCHED to keep SCHED_OTHER (with default priority)
scheduling policy: SCHED_OTHER
scheduling parameter priority: 0 (min=0, max=0)
cpu affinity: 3
after 10000 iterations:
    min=54001 cycles (1000.019 us, 1.000 ms)
    avg=54002.001 cycles (1000.037 us, 1.000 ms)
    max=54009 cycles (1000.167 us, 1.000 ms)
OK: worst latency < 2 ms

using PTHREAD_EXPLICT_SCHED to set SCHED_FIFO (with highest priority)
scheduling policy: SCHED_FIFO
scheduling parameter priority: 99 (min=1, max=99)
cpu affinity: 3
after 10000 iterations:
    min=54001 cycles (1000.019 us, 1.000 ms)
    avg=78900.171 cycles (1461.114 us, 1.461 ms)
    max=53188533 cycles (984972.833 us, 984.973 ms)
ERROR: worst latency > 2 ms

The results with SCHED_OTHER are great, but the results with SCHED_FIFO 
are terrible.  This is surprising to me!  I'd expect a thread using 
SCHED_FIFO with max priority to behave at least as well as a thread 
using SCHED_OTHER with the default priority.  Am I misunderstanding 
something here, or is this a bug?

I'm happy to run any experiments people suggest.

--
Sebastian Kuzminsky