late ktimer events due to C-state transitions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We observe late ktimer events due to C-state transitions from CPU Idle
mode.  The test results come from a 5.15-rt kernel with two
timers-related patches backported:

    "softirq: Use a dedicated thread for timer wakeups." from Sebastian via
     https://lore.kernel.org/linux-rt-users/20211201173316.rm6o67qapqsx2n47@xxxxxxxxxxxxx/

and

     "tick: Fix timer storm since introduction of timersd" from Frederic via
     https://lore.kernel.org/linux-rt-users/20220405010752.1347437-2-frederic@xxxxxxxxxx/

The kernel is configured with NO_HZ_COMMON=y and NO_HZ_IDLE=y.

Expected sequence of events:
@ 0ms: userspace enqueues a timer event to fire in 10ms from now
@ 10ms: ktimer event fires and triggers the online_periodic callback
@ 10ms: userspace enqueues a timer event to fire in 10ms from now
@ 20ms: ktimer event fires and triggers the online_periodic callback
...

What we’re actually seeing via kprobes:
@ 0ms: userspace enqueues a timer event to fire in 10ms from now
@ 10ms: ktimer event fires and triggers the online_periodic callback
@ 10ms: userspace enqueues a timer event to fire in 10ms from now
@ 15ms: CPU enters a deep idle state and sleeps for a long time
(20-100+)ms, let's say 70ms for this example
@ 80ms: ktimer event fires and triggers the online_periodic callback
@ 80ms: userspace enqueues a timer event to fire in 10ms from now

The result is that the second timer callback runs after 75ms instead
of the expected 10ms.


When the timers work well, the hrtimer_interrupt causes
raise_hrtimer_softirq() to run, as expected and __do_softirq() runs:
<userspace>-13534   [007] 16947.039513: enqueue_hrtimer
          <idle>-0       [007] 16947.039555:
hrtimer_next_event_without: 16947.037643167
          <idle>-0       [007] 16947.039562: enqueue_hrtimer
          <idle>-0       [007] 16947.049453: hrtimer_interrupt
          <idle>-0       [007] 16947.049457: raise_hrtimer_softirq.
          <idle>-0       [007] 16947.049474: enqueue_hrtimer
       ktimers/7-80      [007] 16947.049483: __do_softirq
       ktimers/7-80      [007] 16947.049485: hrtimer_run_softirq.
 <userspace>-13534   [007] 16947.049523: enqueue_hrtimer
          <idle>-0       [007] 16947.049567:
hrtimer_next_event_without: 16947.047643167
          <idle>-0       [007] 16947.049573: enqueue_hrtimer
          <idle>-0       [007] 16947.049586: enqueue_hrtimer
          <idle>-0       [007] 16947.049594:
hrtimer_next_event_without: 16947.047643167
          <idle>-0       [007] 16947.049598: enqueue_hrtimer
          <idle>-0       [007] 16947.059418: hrtimer_interrupt
          <idle>-0       [007] 16947.059423: raise_hrtimer_softirq.
          <idle>-0       [007] 16947.059441: enqueue_hrtimer
       ktimers/7-80      [007] 16947.059449: __do_softirq
       ktimers/7-80      [007] 16947.059451: hrtimer_run_softirq.
 <userspace>-13534   [007] 16947.059490: enqueue_hrtimer
          <idle>-0       [007] 16947.059529:
hrtimer_next_event_without: 16947.057643167
          <idle>-0       [007] 16947.059531: enqueue_hrtimer
          <idle>-0       [007] 16947.069434: hrtimer_interrupt

When the timers are delayed, the trouble appears to begin when the
hrtimer_interrupt results in execution of hrtimer_run_queues() instead
of raise_hrtimer_softirq():
 <userspace>-13534   [007] 16947.069504: enqueue_hrtimer
          <idle>-0       [007] 16947.069547:
hrtimer_next_event_without: 16947.067643167
          <idle>-0       [007] 16947.069553: enqueue_hrtimer
          <idle>-0       [007] 16947.069567: enqueue_hrtimer
          <idle>-0       [007] 16947.069575:
hrtimer_next_event_without: 16947.067643167
          <idle>-0       [007] 16947.069579: enqueue_hrtimer
          <idle>-0       [007] 16947.078270: hrtimer_interrupt
          <idle>-0       [007] 16947.078278: hrtimer_run_queues.
          <idle>-0       [007] 16947.078300: enqueue_hrtimer
       ktimers/7-80      [007] 16947.078308: __do_softirq
     ksoftirqd/7-81      [007] 16947.078338: ksoftirqd_should_run 0
          <idle>-0       [007] 16947.078361:
hrtimer_next_event_without: 16947.067643167
          <idle>-0       [007] 16947.079323: hrtimer_interrupt
          <idle>-0       [007] 16947.079328: hrtimer_run_queues.
          <idle>-0       [007] 16947.079334: enqueue_hrtimer
     ksoftirqd/7-81      [007] 16947.079359: ksoftirqd_should_run 128
     ksoftirqd/7-81      [007] 16947.079360: __do_softirq
     ksoftirqd/7-81      [007] 16947.079361: hrtimer_interrupt
     ksoftirqd/7-81      [007] 16947.079361: raise_hrtimer_softirq.
     ksoftirqd/7-81      [007] 16947.079364: ksoftirqd_should_run 0
          <idle>-0       [007] 16947.079375:
hrtimer_next_event_without: 9223372036854775807
          <idle>-0       [007] 16947.079376:
tick_nohz_get_sleep_length: 86.838 ms
          <idle>-0       [007] 16947.079378: enqueue_hrtimer

Is the problem perhaps that ksoftirqd_should_run() needs to check
local_timer_softirq_pending() created by Frederic's patch as well as
local_softirq_pending()?

For the moment, we find that replacing NO_HZ_COMMON=y and NO_HZ_IDLE=y
with NO_HZ_PERIODIC prevents the undesirable sleep-state transition,
but clearly this solution is not ideal.

Thanks for any suggestions,
Alison Chaiken
achaiken@xxxxxxxxxxx
Aurora Innovation




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux